[jira] [Commented] (HBASE-18601) Update Htrace to 4.2

2017-11-27 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268344#comment-16268344
 ] 

Chia-Ping Tsai commented on HBASE-18601:


bq. TraceTree class comes from HTrace 3.2.0, but it has been removed in 4.2.0. 
I noticed the comparator duplication, but I tried to keep the number of 
modifications to a minimum.
Thanks [~balazs.meszaros] for the pointer.

> Update Htrace to 4.2
> 
>
> Key: HBASE-18601
> URL: https://issues.apache.org/jira/browse/HBASE-18601
> Project: HBase
>  Issue Type: Improvement
>  Components: dependencies, tracing
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Tamas Penzes
>Assignee: Balazs Meszaros
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18601.master.001.patch, 
> HBASE-18601.master.002.patch, HBASE-18601.master.003 (3).patch, 
> HBASE-18601.master.003.patch, HBASE-18601.master.004.patch, 
> HBASE-18601.master.004.patch, HBASE-18601.master.005.patch, 
> HBASE-18601.master.006.patch, HBASE-18601.master.006.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.007.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.008.patch, 
> HBASE-18601.master.009.patch, HBASE-18601.master.009.patch, 
> HBASE-18601.master.010.patch, HBASE-18601.master.010.patch, 
> HBASE-18601.master.011.patch, HBASE-18601.master.012.patch, 
> HBASE-18601.master.013.patch, HBASE-18601.master.014.patch, 
> HBASE-18601.master.014.patch, HBASE-18601.master.015.patch, 
> HBASE-18601.master.016.patch
>
>
> HTrace is not perfectly integrated into HBase: version 3.2.0 is buggy, and 
> the upgrade to 4.x is not trivial and would take time. It might not be worth 
> keeping it in this state, so it would be better to remove it.
> Of course this doesn't mean tracing would be useless, just that in this form 
> HTrace 3.2 might not add any value to the project, and fixing it would be far 
> too much effort.
> -
> Based on the decision of the community, we keep HTrace for now and update the 
> version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19252) Move the transform logic of FilterList into transformCell() method to avoid extra ref to question cell

2017-11-27 Thread Zheng Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-19252:
-
Attachment: HBASE-19252-branch-1.4.v1.patch

Re-attaching to see the Hadoop QA result for branch-1.4. 

> Move the transform logic of FilterList into transformCell() method to avoid 
> extra ref to question cell 
> ---
>
> Key: HBASE-19252
> URL: https://issues.apache.org/jira/browse/HBASE-19252
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Minor
> Fix For: 3.0.0, 1.4.1, 2.0.0-beta-1
>
> Attachments: HBASE-19252-branch-1.4.v1.patch, 
> HBASE-19252-branch-1.4.v1.patch, HBASE-19252.v1.patch, HBASE-19252.v2.patch, 
> HBASE-19252.v3.patch, HBASE-19252.v4.patch
>
>
> As [~anoop.hbase] and I discussed, we can implement the filterKeyValue() and 
> transformCell() methods as follows to avoid saving the transformedCell & 
> referenceCell state in FilterList, and so avoid the costly cell clone. 
> {code}
> ReturnCode filterKeyValue(Cell c) {
>   ReturnCode rc = null;
>   for (Filter filter : subFilters) {
>     // ...
>     rc = mergeReturnCode(rc, filter.filterKeyValue(c));
>     // ...
>   }
>   return rc;
> }
> Cell transformCell(Cell c) throws IOException {
>   Cell transformed = c;
>   for (Filter filter : subFilters) {
>     if (filter.filterKeyValue(c) is INCLUDE*) { // <--- line #1
>       transformed = filter.transformCell(transformed);
>     }
>   }
>   return transformed;
> }
> {code}
> For line #1, we need to remember the return code of each sub-filter's 
> filterKeyValue(), because we only need to call transformCell() on a 
> sub-filter when its ReturnCode is INCLUDE*.
> A new boolean array will be introduced in FilterList, and the cost of 
> maintaining the boolean array will be less than the cost of maintaining the 
> two refs to the cell in question. 
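
A minimal sketch of the boolean-array idea, assuming HBase's Filter/Cell API; 
the class name and the merge logic here are illustrative, not the committed 
patch:

{code}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.Filter.ReturnCode;

class FilterListSketch {
  private final List<Filter> filters;        // the sub-filters
  private final boolean[] subFilterIncluded; // written by filterKeyValue(), read by transformCell()

  FilterListSketch(List<Filter> filters) {
    this.filters = filters;
    this.subFilterIncluded = new boolean[filters.size()];
  }

  private static boolean isInclude(ReturnCode rc) {
    return rc == ReturnCode.INCLUDE
        || rc == ReturnCode.INCLUDE_AND_NEXT_COL
        || rc == ReturnCode.INCLUDE_AND_SEEK_NEXT_ROW;
  }

  ReturnCode filterKeyValue(Cell c) throws IOException {
    ReturnCode rc = null;
    for (int i = 0; i < filters.size(); i++) {
      ReturnCode subRc = filters.get(i).filterKeyValue(c);
      subFilterIncluded[i] = isInclude(subRc); // remember for transformCell()
      rc = mergeReturnCode(rc, subRc);
    }
    return rc;
  }

  Cell transformCell(Cell c) throws IOException {
    Cell transformed = c;
    for (int i = 0; i < filters.size(); i++) {
      if (subFilterIncluded[i]) { // only INCLUDE* sub-filters transform the cell
        transformed = filters.get(i).transformCell(transformed);
      }
    }
    return transformed;
  }

  private static ReturnCode mergeReturnCode(ReturnCode a, ReturnCode b) {
    // Placeholder: the real merge depends on MUST_PASS_ALL / MUST_PASS_ONE.
    return a == null ? b : a;
  }
}
{code}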



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268306#comment-16268306
 ] 

Hadoop QA commented on HBASE-18090:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1.3 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 4s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_161 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
16s{color} | {color:red} hbase-server: The patch generated 40 new + 246 
unchanged - 1 fixed = 286 total (was 247) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
18s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 44s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3. 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed with JDK v1.7.0_161 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 45s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
20s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |

[jira] [Commented] (HBASE-18233) We shouldn't wait for readlock in doMiniBatchMutation in case of deadlock

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268307#comment-16268307
 ] 

Hadoop QA commented on HBASE-18233:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} branch-1.2 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} branch-1.2 passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
45s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} branch-1.2 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} branch-1.2 passed with JDK v1.7.0_161 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 4 new + 399 
unchanged - 0 fixed = 403 total (was 399) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
21s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
20m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3. 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.7.0_161 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 77m 
44s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:e77c578 |
| JIRA Issue | HBASE-18233 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899546/HBASE-18233-branch-1.2.v5%20%281%29.patch
 |
| Optional Tests |  

[jira] [Commented] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268304#comment-16268304
 ] 

Yu Li commented on HBASE-19358:
---

Would be great to know:
1. How to decide the value of 
{{hbase.regionserver.hlog.splitlog.writer.threads}}, i.e. how to make full use 
of HDFS capacity without overloading it.
2. The performance numbers, i.e. the effect on recovery time before/after the 
patch.

And the same questions for your similar JIRA, boss, if it is a similar design 
(smile) [~Apache9]

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> The way we split logs now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.
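
For readers skimming the design, a rough sketch of the buffering idea; the 
class and method names are illustrative, not the actual patch. Recovered edits 
accumulate per region, and a fixed-size writer pool flushes them, so the number 
of concurrent HDFS streams is bounded by the pool size rather than by the 
number of regions in the WAL:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BufferedEditsSink {
  private final Map<String, List<byte[]>> buffers = new HashMap<>();
  private long bufferedBytes;
  private final long memoryLimit;        // flush threshold, e.g. a heap fraction
  private final ExecutorService writers; // bounds concurrent output streams

  BufferedEditsSink(long memoryLimit, int writerThreads) {
    this.memoryLimit = memoryLimit;
    // writerThreads plays the role of hbase.regionserver.hlog.splitlog.writer.threads
    this.writers = Executors.newFixedThreadPool(writerThreads);
  }

  /** Called by the single split-reader thread in this sketch. */
  void append(String encodedRegionName, byte[] edit) {
    buffers.computeIfAbsent(encodedRegionName, r -> new ArrayList<>()).add(edit);
    bufferedBytes += edit.length;
    if (bufferedBytes > memoryLimit) {
      flush(); // over the memory limit: hand buffers to the writer pool
    }
  }

  /** Hand the buffered edits to the bounded writer pool and reset. */
  void flush() {
    for (Map.Entry<String, List<byte[]>> e : buffers.entrySet()) {
      final String region = e.getKey();
      final List<byte[]> edits = e.getValue();
      writers.submit(() -> writeAndMove(region, edits));
    }
    buffers.clear(); // safe: submitted tasks hold their own list references
    bufferedBytes = 0;
  }

  private void writeAndMove(String region, List<byte[]> edits) {
    // Open one recovered.edits stream, write, close, then rename into place;
    // at most `writerThreads` of these run at once, which bounds HDFS streams.
  }
}
{code}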



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18601) Update Htrace to 4.2

2017-11-27 Thread Balazs Meszaros (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268282#comment-16268282
 ] 

Balazs Meszaros commented on HBASE-18601:
-

[~chia7712] The {{TraceTree}} class comes from HTrace 3.2.0, but it has been 
removed in 4.2.0. I noticed the comparator duplication, but I tried to keep the 
number of modifications to a minimum.

> Update Htrace to 4.2
> 
>
> Key: HBASE-18601
> URL: https://issues.apache.org/jira/browse/HBASE-18601
> Project: HBase
>  Issue Type: Improvement
>  Components: dependencies, tracing
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Tamas Penzes
>Assignee: Balazs Meszaros
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18601.master.001.patch, 
> HBASE-18601.master.002.patch, HBASE-18601.master.003 (3).patch, 
> HBASE-18601.master.003.patch, HBASE-18601.master.004.patch, 
> HBASE-18601.master.004.patch, HBASE-18601.master.005.patch, 
> HBASE-18601.master.006.patch, HBASE-18601.master.006.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.007.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.008.patch, 
> HBASE-18601.master.009.patch, HBASE-18601.master.009.patch, 
> HBASE-18601.master.010.patch, HBASE-18601.master.010.patch, 
> HBASE-18601.master.011.patch, HBASE-18601.master.012.patch, 
> HBASE-18601.master.013.patch, HBASE-18601.master.014.patch, 
> HBASE-18601.master.014.patch, HBASE-18601.master.015.patch, 
> HBASE-18601.master.016.patch
>
>
> HTrace is not perfectly integrated into HBase: version 3.2.0 is buggy, and 
> the upgrade to 4.x is not trivial and would take time. It might not be worth 
> keeping it in this state, so it would be better to remove it.
> Of course this doesn't mean tracing would be useless, just that in this form 
> HTrace 3.2 might not add any value to the project, and fixing it would be far 
> too much effort.
> -
> Based on the decision of the community, we keep HTrace for now and update the 
> version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268281#comment-16268281
 ] 

Duo Zhang commented on HBASE-19358:
---

I think I have already filed a similar issue before... Let me find it...

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> The way we split logs now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268274#comment-16268274
 ] 

Guanghao Zhang commented on HBASE-19301:


bq. Not just doc... 
Yes. I mean we can fix only the javadoc here, and address the other problems in 
newly opened issues.

> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions about this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv etc., 
> but this returns a pre-created connection (per server) which uses the configs 
> from hbase-site.xml on that server.
> Phoenix needs to create a connection in a CP with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful as that will affect all 
> connections being created on that server.
> This issue is for providing an overloaded getConnection(Configuration) API.
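
A hypothetical sketch of how a CP might use such an overload; the 
getConnection(Configuration) signature is the one proposed above, and the 
config keys overridden are illustrative, so treat the exact API as an 
assumption:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;

class CustomConfigConnection {
  Connection open(RegionCoprocessorEnvironment env) throws IOException {
    // Start from the server's config and override only what this CP needs,
    // instead of editing hbase-site.xml (which would affect every connection).
    Configuration conf = HBaseConfiguration.create(env.getConfiguration());
    conf.setInt("hbase.client.retries.number", 3); // illustrative override
    return env.getConnection(conf); // the overloaded API proposed in this issue
  }
}
{code}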



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-15970) Move Replication Peers into an HBase table too

2017-11-27 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268260#comment-16268260
 ] 

Zheng Hu edited comment on HBASE-15970 at 11/28/17 7:11 AM:


bq. Would hbase:replication be used for this tracking ? 

I think we can save both replication queues and peer states in the 
hbase:replication meta table. For replication queues, we store the queues with 
a rowkey: queueId-. For peer states, we can store them under another rowkey 
with a different format, such as peerId-peer-info or something like that, and 
we store the peer state and peer config in different columns: p:s, p:c (p 
means peer, s means state, c means configuration). 

Besides, we may need procedures to implement a table-based ReplicationPeers 
for notifying the RegionServers when we add/remove peers or update a peer 
config. 

I'd like to prepare the patch for it. 
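
A small sketch of the proposed layout; the table name, rowkey format, and 
p:s / p:c columns come from the comment above, while the class and helper 
names are illustrative:

{code}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

class ReplicationTableLayout {
  static final TableName REPLICATION_TABLE = TableName.valueOf("hbase:replication");
  static final byte[] PEER_FAMILY = Bytes.toBytes("p");
  static final byte[] STATE_QUALIFIER = Bytes.toBytes("s");  // p:s -> peer state
  static final byte[] CONFIG_QUALIFIER = Bytes.toBytes("c"); // p:c -> peer config

  /** Row holding one peer's state and config, keyed like "peerId-peer-info". */
  static Put peerRow(String peerId, byte[] state, byte[] config) {
    Put put = new Put(Bytes.toBytes(peerId + "-peer-info"));
    put.addColumn(PEER_FAMILY, STATE_QUALIFIER, state);
    put.addColumn(PEER_FAMILY, CONFIG_QUALIFIER, config);
    return put;
  }
}
{code}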


was (Author: openinx):
bq. Would hbase:replication be used for this tracking ? 

I think we save both replication queues and peer states in the 
hbase:replication meta table. For replication queues, we store the queues with 
a rowkey: queueId-. For peer states, we can store them under another rowkey 
with a different format, such as peerId-peer-info or something like that, and 
we store the peer state and peer config in different columns: p:s, p:c (p 
means peer, s means state, c means configuration). 

Besides, we may need procedures to implement a table-based ReplicationPeers 
for notifying the RegionServers when we add/remove peers or update a peer 
config. 

I'd like to prepare the patch for it. 

> Move Replication Peers into an HBase table too
> --
>
> Key: HBASE-15970
> URL: https://issues.apache.org/jira/browse/HBASE-15970
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Joseph
>Assignee: Zheng Hu
>
> Currently ReplicationQueuesHBaseTableImpl relies on ReplicationStateZkImpl to 
> track information about the available replication peers (used during 
> claimQueues). We can also move this into an HBase table instead of relying on 
> ZooKeeper



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19351) Deprecated is missing in Table implementations

2017-11-27 Thread Peter Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268264#comment-16268264
 ] 

Peter Somogyi commented on HBASE-19351:
---

Just checked: the javadoc has the deprecation tags.

> Deprecated is missing in Table implementations
> --
>
> Key: HBASE-19351
> URL: https://issues.apache.org/jira/browse/HBASE-19351
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-4
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19351.master.001.patch
>
>
> The Table interface has some deprecated methods, and the implementations do 
> not carry the annotation, so those methods appear as non-deprecated.
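
A minimal illustration of the problem; the method name is borrowed from 
Table's real deprecated API, but the interface here is simplified:

{code}
interface TableLike {
  /** @deprecated use the client/table builder timeout instead. */
  @Deprecated
  void setOperationTimeout(int operationTimeout);
}

class TableImpl implements TableLike {
  // Without repeating @Deprecated (and the @deprecated javadoc) here, the
  // override shows up as non-deprecated in the implementation's javadoc.
  @Deprecated
  @Override
  public void setOperationTimeout(int operationTimeout) {
    // ...
  }
}
{code}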



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-15970) Move Replication Peers into an HBase table too

2017-11-27 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268260#comment-16268260
 ] 

Zheng Hu commented on HBASE-15970:
--

bq. Would hbase:replication be used for this tracking ? 

I think we save both replication queues and peer states in the 
hbase:replication meta table. For replication queues, we store the queues with 
a rowkey: queueId-. For peer states, we can store them under another rowkey 
with a different format, such as peerId-peer-info or something like that, and 
we store the peer state and peer config in different columns: p:s, p:c (p 
means peer, s means state, c means configuration). 

Besides, we may need procedures to implement a table-based ReplicationPeers 
for notifying the RegionServers when we add/remove peers or update a peer 
config. 

I'd like to prepare the patch for it. 

> Move Replication Peers into an HBase table too
> --
>
> Key: HBASE-15970
> URL: https://issues.apache.org/jira/browse/HBASE-15970
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Joseph
>Assignee: Zheng Hu
>
> Currently ReplicationQueuesHBaseTableImpl relies on ReplicationStateZkImpl to 
> track information about the available replication peers (used during 
> claimQueues). We can also move this into an HBase table instead of relying on 
> ZooKeeper



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19349) Introduce wrong version dependency of servlet-api jar

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268255#comment-16268255
 ] 

Guanghao Zhang commented on HBASE-19349:


{code}
  
{code}
I saw this comment in pom.xml. This comment seems outdated and can be removed 
too? [~stack]

> Introduce wrong version dependency of servlet-api jar
> -
>
> Key: HBASE-19349
> URL: https://issues.apache.org/jira/browse/HBASE-19349
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
>
> Build a tarball.
> {code}
> mvn -DskipTests clean install && mvn -DskipTests package assembly:single
> tar zxvf hbase-2.0.0-beta-1-SNAPSHOT-bin.tar.gz
> {code}
> Then I found there is a servlet-api-2.5.jar in the lib directory. The right 
> dependency should be javax.servlet-api-3.1.0.jar.
> Start a distributed cluster with this tarball, and you get an exception when 
> accessing the Master/RS info JSP.
> {code}
> 2017-11-27,10:02:05,066 WARN org.eclipse.jetty.server.HttpChannel: /
> java.lang.NoSuchMethodError: 
> javax.servlet.http.HttpServletRequest.isAsyncSupported()Z
> at 
> org.eclipse.jetty.server.ResourceService.sendData(ResourceService.java:689)
> at 
> org.eclipse.jetty.server.ResourceService.doGet(ResourceService.java:294)
> at 
> org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:458)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
> at 
> org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
> at 
> org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:48)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
> at 
> org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1374)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
> at 
> org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
> at 
> org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> {code}
> Tried mvn dependency:tree but didn't find why servlet-api-2.5.jar was 
> introduced.
> I downloaded hbase-2.0.0-alpha4-bin.tar.gz and didn't find servlet-api-2.5.jar, 
> and built a tarball from hbase-2.0.0-alpha4-src.tar.gz and didn't find 
> servlet-api-2.5.jar either. So this may have been introduced by recent 
> commits, and we should fix it when releasing 2.0.0-beta-1.
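
One quick way to confirm at runtime which jar a servlet class is loaded from 
(generic JDK APIs; this snippet is illustrative and not from the issue itself):

{code}
public class WhichJar {
  public static void main(String[] args) throws ClassNotFoundException {
    // Prints the jar that provides HttpServletRequest on this classpath,
    // e.g. servlet-api-2.5.jar vs javax.servlet-api-3.1.0.jar.
    Class<?> c = Class.forName("javax.servlet.http.HttpServletRequest");
    System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
  }
}
{code}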



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268256#comment-16268256
 ] 

Anoop Sam John commented on HBASE-19301:


Not just doc... This will be really strange. There is one more catch: even 
when the short-circuited connection is targeting this server, the user can be 
the HBase super user (as we have code for this in ACL). This happens when the 
execution is handed over to another thread which is part of a pool. Totally 
strange for a normal user. IMHO we must somehow solve this.

> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions about this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv etc., 
> but this returns a pre-created connection (per server) which uses the configs 
> from hbase-site.xml on that server.
> Phoenix needs to create a connection in a CP with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful as that will affect all 
> connections being created on that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-15970) Move Replication Peers into an HBase table too

2017-11-27 Thread Zheng Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu reassigned HBASE-15970:


Assignee: Zheng Hu

> Move Replication Peers into an HBase table too
> --
>
> Key: HBASE-15970
> URL: https://issues.apache.org/jira/browse/HBASE-15970
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Joseph
>Assignee: Zheng Hu
>
> Currently ReplicationQueuesHBaseTableImpl relies on ReplicationStateZkImpl to 
> track information about the available replication peers (used during 
> claimQueues). We can also move this into an HBase table instead of relying on 
> ZooKeeper



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
The way we split logs now is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
*_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
contains_*.


  was:
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
*_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
contains_*.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> The way we split logs now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19096) Add RowMutions batch support in AsyncTable

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268249#comment-16268249
 ] 

Guanghao Zhang commented on HBASE-19096:


bq. I guess they are passed in so that they are re-used to save some new 
allocation of the builders.
Ok...
bq. Add a 's' to the name. Like 'buildRegionActions'?
Now what the method does is add a list of region actions to a multi-request 
builder. buildRegionActions is good :-) (I can't find a better name...)
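
For context, a sketch of the reuse pattern being described; the shaded 
protobuf package path is an assumption about the HBase 2.0 layout, and the 
fill step is elided:

{code}
import org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos.MultiRequest;

class BuilderReuse {
  // One builder kept around and cleared per call, instead of a fresh
  // MultiRequest.newBuilder() allocation every time.
  private final MultiRequest.Builder multiBuilder = MultiRequest.newBuilder();

  MultiRequest nextRequest() {
    multiBuilder.clear(); // protobuf builders support reset-and-reuse
    // ... add RegionAction entries here, e.g. via the buildRegionActions()
    // discussed above (hypothetical in this sketch)
    return multiBuilder.build();
  }
}
{code}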

> Add RowMutions batch support in AsyncTable
> --
>
> Key: HBASE-19096
> URL: https://issues.apache.org/jira/browse/HBASE-19096
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 2.0.0
>
> Attachments: HBASE-19096-master-v2.patch, 
> HBASE-19096-master-v3.patch, HBASE-19096-master.patch
>
>
> Batch support for RowMutations has been added in the Table interface, but is 
> not in AsyncTable. This JIRA will add it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
*_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
contains_*.


  was:
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:

The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !newLogic.jpg|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:

The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !newLogic.jpg|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:
!https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:
!previousLogic.jpg|thumbnail!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !newLogic.jpg|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:
!previousLogic.jpg|thumbnail!
The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:

The problem is that the OutputSink will write the recovered edits during log 
splitting, which means it will create one WriterAndPath for each region. If the 
cluster is small and the number of regions per RS is large, it will create too 
many HDFS streams at the same time. It is then prone to failure since each 
datanode needs to handle too many streams.

Thus I come up with a new way to split logs.
!newLogic.png|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we 
reach the end; then a thread pool does the rest: writes them to files and moves 
them to the destination.

The biggest benefit is that we can control the number of streams we create 
during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> !previousLogic.jpg|thumbnail!
> The problem is that the OutputSink will write the recovered edits during log 
> splitting, which means it will create one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it will 
> create too many HDFS streams at the same time. It is then prone to failure 
> since each datanode needs to handle too many streams.
> Thus I come up with a new way to split logs.
> !newLogic.jpg|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end; then a thread pool does the rest: writes them to files and 
> moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: previousLogic.jpg

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previousLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268227#comment-16268227
 ] 

Guanghao Zhang commented on HBASE-19301:


bq. But the effective user as seen by the code paths may be different. When it is 
a call to another server, it is fine that the hbase super user is the user, 
because in RPC the Connection user gets passed. But when the target server is 
this one, there is no RPC context and we are not even doing any reset of the old 
context, so the path uses the old context. So the user shows up as the initial 
RPC op user.
Great. This should be added to the javadoc.

> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions for this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc. But 
> this returns a pre-created connection (per server), which uses the configs from 
> hbase-site.xml at that server.
> Phoenix needs to create connections in CPs with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful, as they will affect all 
> connections created at that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: (was: previoutLogic.jpg)

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:

The problem is that the OutputSink writes the recovered edits while splitting the 
log, which means it creates one WriterAndPath for each region. If the cluster is 
small and the number of regions per RS is large, it will create too many HDFS 
streams at the same time, and it is then prone to failure since each datanode 
needs to handle too many streams.

Thus I came up with a new way to split the log.
!newLogic.png|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach 
the end of the log, then a thread pool does the rest: writes them to files and 
moves them to the destination.

The biggest benefit is that we can control the number of streams we create during 
log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:
!previous-logic.png|thumbnail!
The problem is that the OutputSink writes the recovered edits while splitting the 
log, which means it creates one WriterAndPath for each region. If the cluster is 
small and the number of regions per RS is large, it will create too many HDFS 
streams at the same time, and it is then prone to failure since each datanode 
needs to handle too many streams.

Thus I came up with a new way to split the log.
!newLogic.png|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach 
the end of the log, then a thread pool does the rest: writes them to files and 
moves them to the destination.

The biggest benefit is that we can control the number of streams we create during 
log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previoutLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: newLogic.jpg
previoutLogic.jpg

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.jpg, previoutLogic.jpg
>
>
> Now the way we split logs is shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268223#comment-16268223
 ] 

Anoop Sam John commented on HBASE-19301:


Yes. We can not really say here who the user is.
On a short-circuited connection (using the getConnection API), the user in the 
Connection will always be the hbase super user. This is the case whether the 
targeted server is this same server or any other. But the effective user as seen 
by the code paths may be different. When it is a call to another server, it is 
fine that the hbase super user is the user, because in RPC the Connection user 
gets passed. But when the target server is this one, there is no RPC context and 
we are not even doing any reset of the old context, so the path uses the old 
context. So the user shows up as the initial RPC op user.
Beyond just the user, this brings some other possible issues too. Say within a CP 
hook we get the short-circuit connection (any of the APIs) and then use it to 
issue a read request (say the original request was also a read). Now, as said 
above, the old RPC context is still getting used, as there is no RPC involved in 
a short-circuited connection. We do accounting of the response cell size / block 
size etc., and this accumulation happens over the RpcContext instance. That means 
the short-circuited get call's return size is also added to the old size 
accounting, which can give a wrong prediction of the actual initial RPC request.
That said, there should be some way to reset this RPC context when the 
short-circuited call happens. And we get the RPC context, and so the user info, 
via a ThreadLocal, which adds more issues because sometimes the execution path 
hands the job to another thread from a pool.
This looks to be a larger problem now?
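
A tiny, self-contained sketch of the accounting bleed described above (the names 
are made up, not HBase's actual classes): because the context lives in a 
ThreadLocal and is never reset, the short-circuited call's response size is 
charged to the original RPC.
{code:java}
public class RpcContextLeakSketch {
  // Stand-in for per-RPC accounting hanging off a ThreadLocal context.
  static final ThreadLocal<long[]> RESPONSE_BYTES =
      ThreadLocal.withInitial(() -> new long[1]);

  static void serveRead(String what, long cellBytes) {
    // Accounting is tied to whatever context the current thread carries.
    RESPONSE_BYTES.get()[0] += cellBytes;
    System.out.println(what + " added " + cellBytes + " bytes");
  }

  public static void main(String[] args) {
    serveRead("original RPC read", 100);
    // A CP hook now reads over the short-circuited connection: no new RPC,
    // no context reset, so its bytes land in the original request's tally.
    serveRead("short-circuited CP read", 400);
    System.out.println("bytes charged to the original RPC: "
        + RESPONSE_BYTES.get()[0]); // prints 500, not 100
  }
}
{code}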


> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions for this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc. But 
> this returns a pre-created connection (per server), which uses the configs from 
> hbase-site.xml at that server.
> Phoenix needs to create connections in CPs with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful, as they will affect all 
> connections created at that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: (was: previous-logic.png)

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
>
> Now the way we split logs is shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: (was: newLogic.png)

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
>
> Now the way we split logs is shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19336) Improve rsgroup to allow assign all tables within a specified namespace by only writing namespace

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268210#comment-16268210
 ] 

Guanghao Zhang commented on HBASE-19336:


[~xinxin fan] Please change the status to Patch Available to trigger Hadoop QA. 
Thanks.

> Improve rsgroup to allow assign all tables within a specified namespace by 
> only writing namespace
> -
>
> Key: HBASE-19336
> URL: https://issues.apache.org/jira/browse/HBASE-19336
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Affects Versions: 2.0.0-alpha-4
>Reporter: xinxin fan
>Assignee: xinxin fan
> Attachments: HBASE-19336-master-V2.patch, 
> HBASE-19336-master-V3.patch, HBASE-19336-master-V4.patch, 
> HBASE-19336-master-V4.patch, HBASE-19336-master.patch
>
>
> Currently, users can only assign tables within a namespace from one group to 
> another by writing all the table names in the move_tables_rsgroup command. 
> Allowing assignment of all tables within a specified namespace by writing only 
> the namespace name is useful.
> Usage as follows:
> {code:java}
> hbase(main):055:0> move_namespaces_rsgroup 'dest_rsgroup',['ns1']
> Took 2.2211 seconds
> {code}
> {code:java}
> hbase(main):051:0* move_servers_namespaces_rsgroup 
> 'dest_rsgroup',['hbase39.lt.163.org:60020'],['ns1','ns2']
> Took 15.3710 seconds 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Description: 
Now the way we split logs is shown in the following figure:
!previous-logic.png|thumbnail!
The problem is that the OutputSink writes the recovered edits while splitting the 
log, which means it creates one WriterAndPath for each region. If the cluster is 
small and the number of regions per RS is large, it will create too many HDFS 
streams at the same time, and it is then prone to failure since each datanode 
needs to handle too many streams.

Thus I came up with a new way to split the log.
!newLogic.png|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach 
the end of the log, then a thread pool does the rest: writes them to files and 
moves them to the destination.

The biggest benefit is that we can control the number of streams we create during 
log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.


  was:
Now the way we split logs is shown in the following figure:
!previous-logic.png|thumbnail!
The problem is that the OutputSink writes the recovered edits while splitting the 
log, which means it creates one WriterAndPath for each region. If the cluster is 
small and the number of regions per RS is large, it will create too many HDFS 
streams at the same time, and it is then prone to failure since each datanode 
needs to handle too many streams.

Thus I came up with a new way to split the log.
!attachment-name.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach 
the end of the log, then a thread pool does the rest: writes them to files and 
moves them to the destination.

The biggest benefit is that we can control the number of streams we create during 
log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.png, previous-logic.png
>
>
> Now the way we split logs is shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
-
Attachment: newLogic.png
previous-logic.png

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
> Attachments: newLogic.png, previous-logic.png
>
>
> Now the way we split logs is shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region. If the 
> cluster is small and the number of regions per RS is large, it will create too 
> many HDFS streams at the same time, and it is then prone to failure since each 
> datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !attachment-name.jpg|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log, then a thread pool does the rest: writes them to 
> files and moves them to the destination.
> The biggest benefit is that we can control the number of streams we create 
> during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19336) Improve rsgroup to allow assign all tables within a specified namespace by only writing namespace

2017-11-27 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-19336:
---
Attachment: HBASE-19336-master-V4.patch

Reattach for Hadoop QA.

> Improve rsgroup to allow assign all tables within a specified namespace by 
> only writing namespace
> -
>
> Key: HBASE-19336
> URL: https://issues.apache.org/jira/browse/HBASE-19336
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Affects Versions: 2.0.0-alpha-4
>Reporter: xinxin fan
>Assignee: xinxin fan
> Attachments: HBASE-19336-master-V2.patch, 
> HBASE-19336-master-V3.patch, HBASE-19336-master-V4.patch, 
> HBASE-19336-master-V4.patch, HBASE-19336-master.patch
>
>
> Currently, users can only assign tables within a namespace from one group to 
> another by writing all the table names in the move_tables_rsgroup command. 
> Allowing assignment of all tables within a specified namespace by writing only 
> the namespace name is useful.
> Usage as follows:
> {code:java}
> hbase(main):055:0> move_namespaces_rsgroup 'dest_rsgroup',['ns1']
> Took 2.2211 seconds
> {code}
> {code:java}
> hbase(main):051:0* move_servers_namespaces_rsgroup 
> 'dest_rsgroup',['hbase39.lt.163.org:60020'],['ns1','ns2']
> Took 15.3710 seconds 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19096) Add RowMutions batch support in AsyncTable

2017-11-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268204#comment-16268204
 ] 

Jerry He commented on HBASE-19096:
--

The regionActionBuilder, actionBuilder, and mutationBuilder have been there as 
parameters originally. I guess they are passed in so that they are reused, to 
save some new allocations of the builders.
Add an 's' to the name, like 'buildRegionActions'?

> Add RowMutions batch support in AsyncTable
> --
>
> Key: HBASE-19096
> URL: https://issues.apache.org/jira/browse/HBASE-19096
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 2.0.0
>
> Attachments: HBASE-19096-master-v2.patch, 
> HBASE-19096-master-v3.patch, HBASE-19096-master.patch
>
>
> Batch support for RowMutations has been added in the Table interface, but is 
> not in AsyncTable. This JIRA will add it.
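
For reference, a hedged sketch of the intended usage once this lands (the batch() 
shape is assumed from the existing Table counterpart, and "table" is an existing 
AsyncTable handle, not something defined here):
{code:java}
RowMutations rm = new RowMutations(Bytes.toBytes("row1"));
rm.add(new Put(Bytes.toBytes("row1"))
    .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("v")));
rm.add(new Delete(Bytes.toBytes("row1"))
    .addColumns(Bytes.toBytes("cf"), Bytes.toBytes("b")));
// RowMutations is a Row, so it can ride along with other batch actions.
List<CompletableFuture<Object>> futures =
    table.batch(Arrays.asList(rm, new Get(Bytes.toBytes("row2"))));
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
{code}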



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19252) Move the transform logic of FilterList into transformCell() method to avoid extra ref to question cell

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268203#comment-16268203
 ] 

Guanghao Zhang commented on HBASE-19252:


[~openinx] Could you reattach a patch for branch-1 to see the Hadoop QA result? 
Then we can close this issue after committing to branch-1. Thanks.

> Move the transform logic of FilterList into transformCell() method to avoid 
> extra ref to question cell 
> ---
>
> Key: HBASE-19252
> URL: https://issues.apache.org/jira/browse/HBASE-19252
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Minor
> Fix For: 3.0.0, 1.4.1, 2.0.0-beta-1
>
> Attachments: HBASE-19252-branch-1.4.v1.patch, HBASE-19252.v1.patch, 
> HBASE-19252.v2.patch, HBASE-19252.v3.patch, HBASE-19252.v4.patch
>
>
> As [~anoop.hbase] and I discussed,  we can implement the filterKeyValue () 
> and transformCell() methods as following  to avoid saving transformedCell & 
> referenceCell state in FilterList, and we can avoid the costly cell clone. 
> {code}
> ReturnCode filterKeyValue(Cell c) {
>   ReturnCode rc = null;
>   for (Filter filter : sub-filters) {
>     // ...
>     rc = mergeReturnCode(rc, filter.filterKeyValue(c));
>     // ...
>   }
>   return rc;
> }
> Cell transformCell(Cell c) throws IOException {
>   Cell transformed = c;
>   for (Filter filter : sub-filters) {
>     if (filter.filterKeyValue(c) is INCLUDE*) { // <-- line #1
>       transformed = filter.transformCell(transformed);
>     }
>   }
>   return transformed;
> }
> {code}
> For line #1, we need to remember the return code of each sub-filter's 
> filterKeyValue(), because only for an INCLUDE* ReturnCode do we need to call 
> transformCell for the sub-filter.
> A new boolean array will be introduced in FilterList, and the cost of 
> maintaining the boolean array will be less than the cost of maintaining the 
> two refs to the question cell.
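
To make the scheme concrete, a compilable toy version (simplified types, not the 
real Filter API) showing where the boolean array sits:
{code:java}
import java.util.List;

public class FilterListSketch {
  enum ReturnCode { INCLUDE, INCLUDE_AND_NEXT_COL, SKIP }

  interface MiniFilter {
    ReturnCode filterKeyValue(String cell);
    String transformCell(String cell);
  }

  private final List<MiniFilter> subFilters;
  private final boolean[] include; // which sub-filters voted INCLUDE* for this cell

  FilterListSketch(List<MiniFilter> subFilters) {
    this.subFilters = subFilters;
    this.include = new boolean[subFilters.size()];
  }

  ReturnCode filterKeyValue(String cell) {
    ReturnCode rc = ReturnCode.INCLUDE;
    for (int i = 0; i < subFilters.size(); i++) {
      ReturnCode sub = subFilters.get(i).filterKeyValue(cell);
      // Remember the vote so transformCell() needs no extra cell references.
      include[i] =
          sub == ReturnCode.INCLUDE || sub == ReturnCode.INCLUDE_AND_NEXT_COL;
      if (sub == ReturnCode.SKIP) {
        rc = ReturnCode.SKIP; // toy merge rule; the real one is richer
      }
    }
    return rc;
  }

  String transformCell(String cell) {
    String transformed = cell;
    for (int i = 0; i < subFilters.size(); i++) {
      if (include[i]) { // only INCLUDE* sub-filters get to transform
        transformed = subFilters.get(i).transformCell(transformed);
      }
    }
    return transformed;
  }
}
{code}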



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19242) Add MOB compact support for AsyncAdmin

2017-11-27 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-19242:
---
Attachment: HBASE-19242.master.002.patch

Reattach for Hadoop QA.

> Add MOB compact support for AsyncAdmin
> --
>
> Key: HBASE-19242
> URL: https://issues.apache.org/jira/browse/HBASE-19242
> Project: HBase
>  Issue Type: Sub-task
>  Components: Admin, mob
>Reporter: Duo Zhang
>Assignee: Balazs Meszaros
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19242.master.001.patch, 
> HBASE-19242.master.002.patch, HBASE-19242.master.002.patch
>
>
> {code}
> private CompletableFuture<Void> compact(TableName tableName, byte[] columnFamily,
>     boolean major, CompactType compactType) {
>   if (CompactType.MOB.equals(compactType)) {
>     // TODO support MOB compact.
>     return failedFuture(
>         new UnsupportedOperationException("MOB compact does not support"));
>   }
> {code}
> We need to support it.
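
Once supported, the call should look roughly like this (a hedged sketch; the 
exact AsyncAdmin overload and the "admin" handle are assumed, not quoted from a 
patch):
{code:java}
// "admin" is an AsyncAdmin obtained elsewhere, e.g. from an AsyncConnection.
admin.compact(TableName.valueOf("t1"), Bytes.toBytes("cf"), CompactType.MOB)
    .whenComplete((ignored, err) -> {
      if (err != null) {
        // Today this completes exceptionally with UnsupportedOperationException.
        err.printStackTrace();
      }
    });
{code}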



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268191#comment-16268191
 ] 

Hudson commented on HBASE-19354:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #296 (See 
[https://builds.apache.org/job/HBase-1.3-IT/296/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev dca65353d54be8afbd69266e680f6bc621fd165e)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Fix For: 1.3.2, 1.2.7
>
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> The HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turns up the hang and Xiao Chen then confirms it in a 
> comment. Let's do what Xiao Chen suggests, which comes from a Todd approach 
> over in kudu where we install newer versions of the jdk so we bypass the hdfs 
> hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268190#comment-16268190
 ] 

Guanghao Zhang commented on HBASE-19301:


bq. the RPC user within the connection will always be the hbase super user who 
started the server process.
This is not right. The user should be the initial user who started the initial 
RPC operation.

> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions for this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc. But 
> this returns a pre-created connection (per server), which uses the configs from 
> hbase-site.xml at that server.
> Phoenix needs to create connections in CPs with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful, as they will affect all 
> connections created at that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19358) Improve the stability of splitting log when do fail over

2017-11-27 Thread Jingyun Tian (JIRA)
Jingyun Tian created HBASE-19358:


 Summary: Improve the stability of splitting log when do fail over
 Key: HBASE-19358
 URL: https://issues.apache.org/jira/browse/HBASE-19358
 Project: HBase
  Issue Type: Improvement
  Components: MTTR
Affects Versions: 0.98.24
Reporter: Jingyun Tian


Now the way we split logs is shown in the following figure:
!previous-logic.png|thumbnail!
The problem is that the OutputSink writes the recovered edits while splitting the 
log, which means it creates one WriterAndPath for each region. If the cluster is 
small and the number of regions per RS is large, it will create too many HDFS 
streams at the same time, and it is then prone to failure since each datanode 
needs to handle too many streams.

Thus I came up with a new way to split the log.
!attachment-name.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach 
the end of the log, then a thread pool does the rest: writes them to files and 
moves them to the destination.

The biggest benefit is that we can control the number of streams we create during 
log splitting: it will not exceed hbase.regionserver.wal.max.splitters * 
hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19096) Add RowMutions batch support in AsyncTable

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268185#comment-16268185
 ] 

Guanghao Zhang commented on HBASE-19096:


Nice refactor!
{code}
  * @param regionActionBuilder regionActionBuilder to be used to build region 
action.
  * @param actionBuilder actionBuilder to be used to build action.
  * @param mutationBuilder mutationBuilder to be used to build mutation.
{code}
Do we need these parameters? 
And maybe we need to find a better method name for buildRegionAction and 
buildNoDataRegionAction?

> Add RowMutions batch support in AsyncTable
> --
>
> Key: HBASE-19096
> URL: https://issues.apache.org/jira/browse/HBASE-19096
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 2.0.0
>
> Attachments: HBASE-19096-master-v2.patch, 
> HBASE-19096-master-v3.patch, HBASE-19096-master.patch
>
>
> Batch support for RowMutations has been added in the Table interface, but is 
> not in AsyncTable. This JIRA will add it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268182#comment-16268182
 ] 

stack commented on HBASE-19301:
---

This one is almost done then [~anoop.hbase]? Just add doc on commit sir?

> Provide way for CPs to create short circuited connection with custom 
> configurations
> ---
>
> Key: HBASE-19301
> URL: https://issues.apache.org/jira/browse/HBASE-19301
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19301-addendum.patch, HBASE-19301.patch, 
> HBASE-19301_V2.patch, HBASE-19301_V2.patch
>
>
> Over in HBASE-18359 we have discussions for this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc. But 
> this returns a pre-created connection (per server), which uses the configs from 
> hbase-site.xml at that server.
> Phoenix needs to create connections in CPs with some custom configs. Having 
> these custom changes in hbase-site.xml is harmful, as they will affect all 
> connections created at that server.
> This issue is for providing an overloaded getConnection(Configuration) API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268178#comment-16268178
 ] 

Hudson commented on HBASE-19354:


FAILURE: Integrated in Jenkins build HBase-1.3-JDK7 #356 (See 
[https://builds.apache.org/job/HBase-1.3-JDK7/356/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev dca65353d54be8afbd69266e680f6bc621fd165e)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Fix For: 1.3.2, 1.2.7
>
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> The HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turns up the hang and Xiao Chen then confirms it in a 
> comment. Let's do what Xiao Chen suggests, which comes from a Todd approach 
> over in kudu where we install newer versions of the jdk so we bypass the hdfs 
> hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268175#comment-16268175
 ] 

stack commented on HBASE-18090:
---

branch-1.3.001 is a backport of the branch-1 patch. Let's see how it does after 
pushing HBASE-19354 on branch-1.3.

FYI [~xinxin fan], I tried bringing the patch back to 1.2 but it's a bit rough. 
Maybe you only intend it for branch-1.3+? No worries.

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places an unnecessary restriction: the region layout of the original 
> table needs to take the processing resources available to the MR job into 
> consideration. Allowing multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.
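
For reference, a hedged sketch of the call shape this enables (the 
split-algorithm and per-region-split parameters are assumed for illustration, 
not quoted from the patch):
{code:java}
// Assumes a configured MapReduce Job named "job", a restore dir Path
// "restoreDir", and a mapper class MyMapper.
TableMapReduceUtil.initTableSnapshotMapperJob(
    "mySnapshot", new Scan(), MyMapper.class,
    ImmutableBytesWritable.class, Result.class, job,
    true, restoreDir,
    new RegionSplitter.UniformSplit(), // how each region's key range is re-split
    4);                                // desired number of mappers per region
{code}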



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268174#comment-16268174
 ] 

Hudson commented on HBASE-19354:


FAILURE: Integrated in Jenkins build HBase-1.3-JDK8 #376 (See 
[https://builds.apache.org/job/HBase-1.3-JDK8/376/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev dca65353d54be8afbd69266e680f6bc621fd165e)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Fix For: 1.3.2, 1.2.7
>
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> The HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turns up the hang and Xiao Chen then confirms it in a 
> comment. Let's do what Xiao Chen suggests, which comes from a Todd approach 
> over in kudu where we install newer versions of the jdk so we bypass the hdfs 
> hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Attachment: HBASE-18090.branch-1.3.001.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places an unnecessary restriction: the region layout of the original 
> table needs to take the processing resources available to the MR job into 
> consideration. Allowing multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268170#comment-16268170
 ] 

stack commented on HBASE-19354:
---

I just pushed on branch-1.3 so I can try a run with HBASE-18090 (It won't come 
back to 1.2 easily).

> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Fix For: 1.3.2, 1.2.7
>
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> The HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turns up the hang and Xiao Chen then confirms it in a 
> comment. Let's do what Xiao Chen suggests, which comes from a Todd approach 
> over in kudu where we install newer versions of the jdk so we bypass the hdfs 
> hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19354:
--
Fix Version/s: 1.2.7
   1.3.2

> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Fix For: 1.3.2, 1.2.7
>
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> The HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turns up the hang and Xiao Chen then confirms it in a 
> comment. Let's do what Xiao Chen suggests, which comes from a Todd approach 
> over in kudu where we install newer versions of the jdk so we bypass the hdfs 
> hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19342) fix TestTableBasedReplicationSourceManagerImpl#testRemovePeerMetricsCleanup

2017-11-27 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268162#comment-16268162
 ] 

Ashu Pachauri commented on HBASE-19342:
---

+1 Thanks for fixing this [~chia7712]

> fix TestTableBasedReplicationSourceManagerImpl#testRemovePeerMetricsCleanup
> ---
>
> Key: HBASE-19342
> URL: https://issues.apache.org/jira/browse/HBASE-19342
> Project: HBase
>  Issue Type: Test
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Attachments: HBASE-19342.v0.patch
>
>
> It is number one in [flaky 
> tests|https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19357) Bucket cache no longer L2 for LRU cache

2017-11-27 Thread Anoop Sam John (JIRA)
Anoop Sam John created HBASE-19357:
--

 Summary: Bucket cache no longer L2 for LRU cache
 Key: HBASE-19357
 URL: https://issues.apache.org/jira/browse/HBASE-19357
 Project: HBase
  Issue Type: Sub-task
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0-beta-1


When the Bucket cache is used, by default we don't configure it as an L2 cache 
alone. The default setting is combined mode ON, where data blocks go to the 
Bucket cache and index/bloom blocks go to the LRU cache. But there is a way to 
turn this off and make the LRU cache L1 with the Bucket cache as a victim 
handler for L1; it will then be just L2.
After the off-heap read path optimization, the Bucket cache is no longer slower 
compared to L1. We have test results on data sizes from 12 GB. The Alibaba use 
case was also with 12 GB, and they observed a ~30% QPS improvement over the LRU 
cache.
This issue is to remove the option for combined mode = false. So when the Bucket 
cache is in use, data blocks will go to it only and the LRU cache will get only 
index/meta/bloom blocks. The Bucket cache will no longer be configured as a 
victim handler for the LRU cache.

Note: only when an external cache is in use does the L1/L2 distinction remain. 
The LRU cache will be L1 and the external cache acts as its L2. That makes full 
sense.
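
For context, the escape hatch this issue removes is the combined-mode switch; a 
sketch of the setting as it exists today (hbase-site.xml):
{noformat}
<!-- Today's victim-handler (pure L2) mode, to be removed by this issue -->
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>false</value>
</property>
{noformat}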



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19096) Add RowMutions batch support in AsyncTable

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268149#comment-16268149
 ] 

Hadoop QA commented on HBASE-19096:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
19s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
28s{color} | {color:red} hbase-client: The patch generated 1 new + 126 
unchanged - 4 fixed = 127 total (was 130) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
50m 56s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
36s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 89m 
39s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}165m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19096 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899533/HBASE-19096-master-v3.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5340072e1638 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3863559b07 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| 

[jira] [Updated] (HBASE-18233) We shouldn't wait for readlock in doMiniBatchMutation in case of deadlock

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18233:
--
Attachment: HBASE-18233-branch-1.2.v5 (1).patch

Retry after committing HBASE-19354 ([branch-1] Build using a jdk that is beyond 
ubuntu trusty's openjdk-151).

> We shouldn't wait for readlock in doMiniBatchMutation in case of deadlock
> -
>
> Key: HBASE-18233
> URL: https://issues.apache.org/jira/browse/HBASE-18233
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.7
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 1.3.2, 1.4.1, 1.2.7
>
> Attachments: HBASE-18233-branch-1.2.patch, 
> HBASE-18233-branch-1.2.trivial.patch, HBASE-18233-branch-1.2.v2.patch, 
> HBASE-18233-branch-1.2.v3.patch, HBASE-18233-branch-1.2.v4 (1).patch, 
> HBASE-18233-branch-1.2.v4 (1).patch, HBASE-18233-branch-1.2.v4 (2).patch, 
> HBASE-18233-branch-1.2.v4.patch, HBASE-18233-branch-1.2.v4.patch, 
> HBASE-18233-branch-1.2.v4.patch, HBASE-18233-branch-1.2.v4.patch, 
> HBASE-18233-branch-1.2.v4.patch, HBASE-18233-branch-1.2.v4.patch, 
> HBASE-18233-branch-1.2.v4.patch, HBASE-18233-branch-1.2.v4.patch, 
> HBASE-18233-branch-1.2.v5 (1).patch, HBASE-18233-branch-1.2.v5.patch, 
> HBASE-18233-branch-1.2.v5.patch, HBASE-18233.branch-1.2.UT.log
>
>
> Please refer to the discuss in HBASE-18144
> https://issues.apache.org/jira/browse/HBASE-18144?focusedCommentId=16051701=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16051701



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-27 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268142#comment-16268142
 ] 

Ashish Singhi edited comment on HBASE-19320 at 11/28/17 5:30 AM:
-

We have faced this issue a couple of times in our clusters with replication, 
where replication was getting stuck because of this (the base version of HBase 
was 1.0.2).
Possible cause:
If heavy writes to the source cluster are observed, RegionServers in the peer 
cluster may be unable to replicate data over a long period of time, whether due 
to RegionServer shutdown, DataNode faults, or network issues. When the 
RegionServer functionality is restored, it may receive a replication request 
with a lot of edits in it. Consequently, RegionServers in the peer cluster 
take more time than hbase.rpc.replication.timeout to process the replication 
request. Once the time spent processing the replication request exceeds the 
time limit, the RegionServer in the source cluster will resend the request, 
incurring the possibility that the direct buffer memory (default size is 64 MB) 
gets too full to serve any further requests.

Logs in the source cluster RS were something like this:
{noformat}
2016-05-26 10:12:00,367 | WARN  | regionserver/XX/XX:21302.replicationSource,33 
| Can't replicate because of an error on the remote cluster:  | 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:275)
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ipc.RpcServer$CallQueueTooBigException):
 Call queue is full on /XX:21302, is hbase.ipc.server.max.callqueue.size too 
small?
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1277)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:324)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.replicateWALEntry(AdminProtos.java:25690)
at 
org.apache.hadoop.hbase.protobuf.ReplicationProtbufUtil.replicateWALEntry(ReplicationProtbufUtil.java:79)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:381)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:364)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

In the peer cluster RS, like this:
{noformat}
2016-05-26 10:12:21,971 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:31,990 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:42,006 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:52,029 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
{noformat}

The procedure we used to avoid this issue was:
Add -XX:MaxDirectMemorySize=1024m to HBASE_OPTS in the 
$HBASE_CONF_DIR/hbase-env.sh of each RegionServer process in the peer cluster 
and then restart all these processes.
After increasing MaxDirectMemorySize we have not faced this issue again; it's 
been more than a year now.
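For reference, the change amounts to one line in hbase-env.sh; a minimal 
sketch, assuming the usual HBASE_OPTS setup (1024m is what worked for our 
workload, not a universal value):
{noformat}
# hbase-env.sh on each peer-cluster RegionServer (illustrative excerpt).
# Caps direct memory so cached per-thread direct buffers cannot grow without
# bound; without the flag the cap defaults to roughly the max heap size.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=1024m"
{noformat}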


was (Author: ashish singhi):
We have faced this issue a couple of times in our clusters with replication, 
where replication was getting stuck because of this (base version of HBase was 
1.0.2).
Possible cause:
If heavy writes to the source cluster are observed, RegionServers in the peer 
cluster may be unable to replicate data over a long period of time, due to 
RegionServer shutdown, DataNode faults, or network issues. When the 
RegionServer functionality is restored, it may receive a replication request 
with a lot of edits in it. Consequently, RegionServers in the peer cluster 
take more time than hbase.rpc.replication.timeout to process the replication 
request. Once the time spent processing the replication request exceeds the 
time limit, 

[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-27 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268142#comment-16268142
 ] 

Ashish Singhi commented on HBASE-19320:
---

We have faced this issue a couple of times in our clusters with replication, 
where replication was getting stuck because of this (base version of HBase was 
1.0.2).
Possible cause:
If heavy writes to the source cluster are observed, RegionServers in the peer 
cluster may be unable to replicate data over a long period of time, due to 
RegionServer shutdown, DataNode faults, or network issues. When the 
RegionServer functionality is restored, it may receive a replication request 
with a lot of edits in it. Consequently, RegionServers in the peer cluster 
take more time than hbase.rpc.replication.timeout to process the replication 
request. Once the time spent processing the replication request exceeds the 
time limit, the RegionServer in the source cluster will resend the request, 
with the possibility that the direct buffer memory (default size 64 MB) 
becomes too full to serve any further requests.

Logs in the source cluster RS were something like this:
{noformat}
2016-05-26 10:12:00,367 | WARN  | regionserver/XX/XX:21302.replicationSource,33 
| Can't replicate because of an error on the remote cluster:  | 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:275)
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ipc.RpcServer$CallQueueTooBigException):
 Call queue is full on /192.168.154.55:21302, is 
hbase.ipc.server.max.callqueue.size too small?
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1277)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:324)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.replicateWALEntry(AdminProtos.java:25690)
at 
org.apache.hadoop.hbase.protobuf.ReplicationProtbufUtil.replicateWALEntry(ReplicationProtbufUtil.java:79)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:381)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:364)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

In the peer cluster RS, like this:
{noformat}
2016-05-26 10:12:21,971 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:31,990 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:42,006 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:52,029 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
{noformat}

The procedure we used to avoid this issue was:
Add -XX:MaxDirectMemorySize=1024m to HBASE_OPTS in the 
${HBASE_HOME}/conf/hbase-env.sh of each RegionServer process in the peer 
cluster and then restart all these processes.
After increasing MaxDirectMemorySize we have not faced this issue again; it's 
been more than a year now.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share 
> them with the community.
> Basically, it is the issue described in 
> 

[jira] [Comment Edited] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-27 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268142#comment-16268142
 ] 

Ashish Singhi edited comment on HBASE-19320 at 11/28/17 5:29 AM:
-

We have faced this issue a couple of times in our clusters with replication, 
where replication was getting stuck because of this (base version of HBase was 
1.0.2).
Possible cause:
If heavy writes to the source cluster are observed, RegionServers in the peer 
cluster may be unable to replicate data over a long period of time, due to 
RegionServer shutdown, DataNode faults, or network issues. When the 
RegionServer functionality is restored, it may receive a replication request 
with a lot of edits in it. Consequently, RegionServers in the peer cluster 
take more time than hbase.rpc.replication.timeout to process the replication 
request. Once the time spent processing the replication request exceeds the 
time limit, the RegionServer in the source cluster will resend the request, 
with the possibility that the direct buffer memory (default size 64 MB) 
becomes too full to serve any further requests.

Logs in the source cluster RS were something like this:
{noformat}
2016-05-26 10:12:00,367 | WARN  | regionserver/XX/XX:21302.replicationSource,33 
| Can't replicate because of an error on the remote cluster:  | 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:275)
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ipc.RpcServer$CallQueueTooBigException):
 Call queue is full on /XX:21302, is hbase.ipc.server.max.callqueue.size too 
small?
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1277)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:324)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.replicateWALEntry(AdminProtos.java:25690)
at 
org.apache.hadoop.hbase.protobuf.ReplicationProtbufUtil.replicateWALEntry(ReplicationProtbufUtil.java:79)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:381)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:364)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

In the peer cluster RS, like this:
{noformat}
2016-05-26 10:12:21,971 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:31,990 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:42,006 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
2016-05-26 10:12:52,029 | INFO  | pool-5614-thread-1 | #2, waiting for 7232  
actions to finish | 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.waitUntilDone(AsyncProcess.java:1572)
{noformat}

The procedure we used to avoid this issue was:
Add -XX:MaxDirectMemorySize=1024m to HBASE_OPTS in the 
${HBASE_HOME}/conf/hbase-env.sh of each RegionServer process in the peer 
cluster and then restart all these processes.
After increasing MaxDirectMemorySize we have not faced this issue again; it's 
been more than a year now.


was (Author: ashish singhi):
We have faced this issue a couple of times in our clusters with replication, 
where replication was getting stuck because of this (base version of HBase was 
1.0.2).
Possible cause:
If heavy writes to the source cluster are observed, RegionServers in the peer 
cluster may be unable to replicate data over a long period of time, due to 
RegionServer shutdown, DataNode faults, or network issues. When the 
RegionServer functionality is restored, it may receive a replication request 
with a lot of edits in it. Consequently, RegionServers in the peer cluster 
take more time than hbase.rpc.replication.timeout to process the replication 
request. Once the time spent processing the replication request exceeds the 
time limit, 

[jira] [Commented] (HBASE-19035) Miss metrics when coprocessor use region scanner to read data

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268137#comment-16268137
 ] 

stack commented on HBASE-19035:
---

Here is a patch for branch-1.2 after the HBASE-19354 commit, which hopefully 
settles branch-1.2 builds. Let's see how it does.

> Miss metrics when coprocessor use region scanner to read data
> -
>
> Key: HBASE-19035
> URL: https://issues.apache.org/jira/browse/HBASE-19035
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19035.branch-1.001.patch, 
> HBASE-19035.branch-1.2.001.patch, HBASE-19035.branch-1.patch, 
> HBASE-19035.branch-1.patch, HBASE-19035.branch-1.patch, 
> HBASE-19035.branch-1.patch, HBASE-19035.master.001.patch, 
> HBASE-19035.master.002.patch, HBASE-19035.master.003.patch, 
> HBASE-19035.master.003.patch
>
>
> The Region interface is exposed to coprocessors, so a coprocessor can use 
> getScanner to get a region scanner to read data. But the scan metrics were 
> only updated at the region server level, so we will miss some scan metrics 
> for reads from a coprocessor.
> || Region Operation || When to update requests metric ||
> | get | update read metric in nextRaw() |
> | put | update write metric in batchMutate() |
> | delete | update write metric in batchMutate() |
> | increment | update read metric by get() and  update write metric in 
> doDelta() |
> | append | update read metric by get() and  update write metric in doDelta() |
> | mutateRow | update write metric in processRowsWithLocks() |
> | mutateRowsWithLocks | update write metric in processRowsWithLocks() |
> | batchMutate | update write metric in batchMutate() |
> | checkAndMutate | update read metric by get() and  update write metric by 
> mutateRow() |
> | checkAndRowMutate | update read metric by get() and  update write metric by 
> doBatchMutate() |
> | processRowsWithLocks | update write metric in processRowsWithLocks() |
> 1. Move read requests to the region level, because RegionScanner is exposed 
> to CPs.
> 2. Update the write requests count in processRowsWithLocks. This was missed 
> in the previous implementation, too.
> 3. Remove requestRowActionCount in RSRpcServices. This metric can be 
> computed from the region's readRequestsCount and writeRequestsCount.
> Upload to review board: https://reviews.apache.org/r/63579/
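
To make the table above concrete, a minimal sketch of the region-level 
accounting idea (field and method names are illustrative stand-ins, not the 
actual HRegion code):
{code:title=RegionMetricsSketch.java}
import java.util.concurrent.atomic.LongAdder;

final class RegionMetricsSketch {
  // Counters live on the region itself, so reads issued through a
  // coprocessor's RegionScanner are counted just like client reads.
  final LongAdder readRequestsCount = new LongAdder();
  final LongAdder writeRequestsCount = new LongAdder();

  void onNextRaw() {                   // get/scan path, per the table above
    readRequestsCount.increment();
  }

  void onBatchMutate(int mutations) {  // put/delete path, per the table above
    writeRequestsCount.add(mutations);
  }

  // The RS-level row-action count can then be derived rather than tracked
  // separately (point 3 above).
  long totalRequests() {
    return readRequestsCount.sum() + writeRequestsCount.sum();
  }
}
{code}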



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19035) Miss metrics when coprocessor use region scanner to read data

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19035:
--
Attachment: HBASE-19035.branch-1.2.001.patch

> Miss metrics when coprocessor use region scanner to read data
> -
>
> Key: HBASE-19035
> URL: https://issues.apache.org/jira/browse/HBASE-19035
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19035.branch-1.001.patch, 
> HBASE-19035.branch-1.2.001.patch, HBASE-19035.branch-1.patch, 
> HBASE-19035.branch-1.patch, HBASE-19035.branch-1.patch, 
> HBASE-19035.branch-1.patch, HBASE-19035.master.001.patch, 
> HBASE-19035.master.002.patch, HBASE-19035.master.003.patch, 
> HBASE-19035.master.003.patch
>
>
> The Region interface is exposed to coprocessors, so a coprocessor can use 
> getScanner to get a region scanner to read data. But the scan metrics were 
> only updated at the region server level, so we will miss some scan metrics 
> for reads from a coprocessor.
> || Region Operation || When to update requests metric ||
> | get | update read metric in nextRaw() |
> | put | update write metric in batchMutate() |
> | delete | update write metric in batchMutate() |
> | increment | update read metric by get() and  update write metric in 
> doDelta() |
> | append | update read metric by get() and  update write metric in doDelta() |
> | mutateRow | update write metric in processRowsWithLocks() |
> | mutateRowsWithLocks | update write metric in processRowsWithLocks() |
> | batchMutate | update write metric in batchMutate() |
> | checkAndMutate | update read metric by get() and  update write metric by 
> mutateRow() |
> | checkAndRowMutate | update read metric by get() and  update write metric by 
> doBatchMutate() |
> | processRowsWithLocks | update write metric in processRowsWithLocks() |
> 1. Move read requests to the region level, because RegionScanner is exposed 
> to CPs.
> 2. Update the write requests count in processRowsWithLocks. This was missed 
> in the previous implementation, too.
> 3. Remove requestRowActionCount in RSRpcServices. This metric can be 
> computed from the region's readRequestsCount and writeRequestsCount.
> Upload to review board: https://reviews.apache.org/r/63579/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268136#comment-16268136
 ] 

Hudson commented on HBASE-19354:


FAILURE: Integrated in Jenkins build HBase-1.2-IT #1022 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1022/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev e77c5787491045ab45884d4ccceb6c66f51efe7d)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19350) TestMetaWithReplicas is flaky in branch-1

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19350:
--
Attachment: HBASE-19350.branch-1.v0.test.patch

Retry

> TestMetaWithReplicas is flaky in branch-1
> -
>
> Key: HBASE-19350
> URL: https://issues.apache.org/jira/browse/HBASE-19350
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 1.4.0, 2.0.0-beta-1
>
> Attachments: HBASE-19350.branch-1.v0.test.patch, 
> HBASE-19350.branch-1.v0.test.patch
>
>
> If the size of RegionsInTransition is zero, the list passed to 
> {{ClusterStatus}} will be null.
> {code:title=ClusterStatus.java}
> Set<RegionState> rit = null;
> if (!proto.getRegionsInTransitionList().isEmpty()) {
>   rit = new 
> HashSet<RegionState>(proto.getRegionsInTransitionList().size());
>   for (RegionInTransition region : proto.getRegionsInTransitionList()) {
> RegionState value = RegionState.convert(region.getRegionState());
> rit.add(value);
>   }
> }
> {code}
> It causes an NPE if someone tries to do the for-each work. HBaseFsckRepair 
> is a real-life example.
> {code:title=HBaseFsckRepair.java}
> for (RegionState rs: 
> admin.getClusterStatus().getRegionsInTransition()) {
>   if (rs.getRegion().equals(region)) {
> inTransition = true;
> break;
>   }
> }
> {code}
> branch-2/master don't have this issue, as the list of RegionsInTransition 
> passed to {{ClusterStatus}} is never null.
> {code:title=ProtobufUtil.java}
> List<RegionState> rit =
>   new ArrayList<>(proto.getRegionsInTransitionList().size());
> for (RegionInTransition region : proto.getRegionsInTransitionList()) {
>   RegionState value = RegionState.convert(region.getRegionState());
>   rit.add(value);
> }
> {code}
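
A minimal sketch of the corresponding branch-1 remedy, assuming it mirrors the 
branch-2/master conversion quoted above (a drop-in fragment, illustrative, not 
the committed patch):
{code:title=Branch1FixSketch.java}
// Always materialize the set, even when the protobuf list is empty, so
// for-each callers such as HBaseFsckRepair can never hit an NPE.
Set<RegionState> rit =
  new HashSet<RegionState>(proto.getRegionsInTransitionList().size());
for (RegionInTransition region : proto.getRegionsInTransitionList()) {
  rit.add(RegionState.convert(region.getRegionState()));
}
{code}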



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19350) TestMetaWithReplicas is flaky in branch-1

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268129#comment-16268129
 ] 

stack commented on HBASE-19350:
---

+1 Let me retry scheduling.

> TestMetaWithReplicas is flaky in branch-1
> -
>
> Key: HBASE-19350
> URL: https://issues.apache.org/jira/browse/HBASE-19350
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 1.4.0, 2.0.0-beta-1
>
> Attachments: HBASE-19350.branch-1.v0.test.patch
>
>
> If the size of RegionsInTransition is zero, the list passed to 
> {{ClusterStatus}} will be null.
> {code:title=ClusterStatus.java}
> Set<RegionState> rit = null;
> if (!proto.getRegionsInTransitionList().isEmpty()) {
>   rit = new 
> HashSet<RegionState>(proto.getRegionsInTransitionList().size());
>   for (RegionInTransition region : proto.getRegionsInTransitionList()) {
> RegionState value = RegionState.convert(region.getRegionState());
> rit.add(value);
>   }
> }
> {code}
> It causes an NPE if someone tries to do the for-each work. HBaseFsckRepair 
> is a real-life example.
> {code:title=HBaseFsckRepair.java}
> for (RegionState rs: 
> admin.getClusterStatus().getRegionsInTransition()) {
>   if (rs.getRegion().equals(region)) {
> inTransition = true;
> break;
>   }
> }
> {code}
> branch-2/master don't have this issue, as the list of RegionsInTransition 
> passed to {{ClusterStatus}} is never null.
> {code:title=ProtobufUtil.java}
> List<RegionState> rit =
>   new ArrayList<>(proto.getRegionsInTransitionList().size());
> for (RegionInTransition region : proto.getRegionsInTransitionList()) {
>   RegionState value = RegionState.convert(region.getRegionState());
>   rit.add(value);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19328) Remove asked if splittable log messages

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19328:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I agree w/ you [~chia7712]. Added you as sign-off on the patch. Pushed to 
master and branch-2.

> Remove asked if splittable log messages
> ---
>
> Key: HBASE-19328
> URL: https://issues.apache.org/jira/browse/HBASE-19328
> Project: HBase
>  Issue Type: Task
>  Components: proc-v2
>Affects Versions: 3.0.0
>Reporter: Balazs Meszaros
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19328.master.001.patch
>
>
> I have found this log message in the HBase log:
> {code}
> 2017-11-22 11:16:54,133 INFO  
> [RpcServer.priority.FPBQ.Fifo.handler=5,queue=0,port=52586] 
> regionserver.HRegion(1309): ASKED IF SPLITTABLE true 
> 0a66d6e20801eec2c6cd1204fedde592
> java.lang.Throwable: LOGGING: REMOVE
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.isSplittable(HRegion.java:1310)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1665)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28159)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:325)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:305)
> {code}
> Do we still need this?
> It was introduced in commit {{dc1065a85}} by [~stack] and [~mbertozzi].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2017-11-27 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268124#comment-16268124
 ] 

Anoop Sam John commented on HBASE-19320:


bq. I think Anoop mentioned the read path, how about the write path? 
From 2.0, yes, the write path also uses DBBs from the pool. No on-demand 
on-heap byte[] creation, and so no DBB creation (pooled or not) in the NIO 
layer.
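
For anyone who wants to reproduce the underlying JDK behavior (per the 
evanjones.ca article cited in the issue description below), a self-contained 
sketch; illustrative only, not HBase code:
{code:title=DirectBufferGrowthDemo.java}
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBufferGrowthDemo {
  public static void main(String[] args) throws Exception {
    // Writing a *heap* buffer to a channel makes the JDK copy it through a
    // per-thread cached direct buffer sized to the write, so direct memory
    // grows with threads x largest-write even though this program never
    // allocates a direct buffer itself.
    Thread[] writers = new Thread[4];
    for (int i = 0; i < writers.length; i++) {
      writers[i] = new Thread(() -> {
        try {
          Path tmp = Files.createTempFile("dbb-demo", ".bin");
          try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.allocate(64 * 1024 * 1024)); // 64 MB heap buffer
          } finally {
            Files.deleteIfExists(tmp);
          }
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      });
      writers[i].start();
    }
    for (Thread t : writers) {
      t.join();
    }
    // The cached buffers are reclaimed only after the owning threads die and
    // a GC runs their cleaners, which is why the growth reads like a leak.
    for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) {
        System.out.println("direct pool used: " + pool.getMemoryUsed() + " bytes");
      }
    }
  }
}
{code}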

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot 
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. Having discussed it internally with our 
> [~saint@gmail.com], we thought we had some findings and want to share 
> them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268122#comment-16268122
 ] 

Hudson commented on HBASE-19354:


FAILURE: Integrated in Jenkins build HBase-1.2-JDK8 #279 (See 
[https://builds.apache.org/job/HBase-1.2-JDK8/279/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev e77c5787491045ab45884d4ccceb6c66f51efe7d)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19354:
--
Release Note: 
The jdks in ubuntu trusty (14.04), our Docker os, don't work: HDFS hangs on 
trusty's openjdk-7 151, which makes a bunch of hbase tests just hang/timeout. 
See HBASE-19204. This patch adds azul jdks to our Docker env because they are 
available and are later versions of openjdk (openjdk-7 161), a version that 
does not seem to hang HDFS.

This change adds the azul repo, then installs its jdks and renames the azul 
jvms as though they were from openjdk (otherwise yetus won't set JAVA_HOME; it 
does "find /usr/lib/jvm/ -name java-* -type d", so a symlink to the zulu jvms 
won't work).

> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268120#comment-16268120
 ] 

Hudson commented on HBASE-19354:


FAILURE: Integrated in Jenkins build HBase-1.2-JDK7 #282 (See 
[https://builds.apache.org/job/HBase-1.2-JDK7/282/])
HBASE-19354 [branch-1] Build using a jdk that is beyond ubuntu trusty's (stack: 
rev e77c5787491045ab45884d4ccceb6c66f51efe7d)
* (edit) dev-support/docker/Dockerfile


> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268119#comment-16268119
 ] 

stack commented on HBASE-19354:
---

Pushed to branch-1.2. Let me try some test runs and see how it does. Will push 
to the rest of branch-1 if it looks good.

> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19354) [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268115#comment-16268115
 ] 

stack commented on HBASE-19354:
---

The run in the parent just completed and looks reasonable. One test failure. 
The count of tests run is very close to (in fact more than) what I see when I 
run on local hw. Let me push this and try some outstanding patches waiting on 
branch-1.2 to become sane again.

> [branch-1] Build using a jdk that is beyond ubuntu trusty's openjdk-151
> ---
>
> Key: HBASE-19354
> URL: https://issues.apache.org/jira/browse/HBASE-19354
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-19354.branch-1.2.001.patch
>
>
> HDFS mini cluster hangs when it runs on openjdk 151. See the parent issue, 
> where [~chia7712] turned up the hang and Xiao Chen confirmed it in a comment. 
> Let's do what Xiao Chen suggests, which comes from a Todd approach over in 
> kudu: we install newer versions of the jdk so we by-pass the hdfs hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19204) branch-1.2 times out and is taking 6-7 hours to complete

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268114#comment-16268114
 ] 

stack commented on HBASE-19204:
---

This looks better. No rampant timeouts. Just the single test failure. Let me 
commit the subtask.

> branch-1.2 times out and is taking 6-7 hours to complete
> 
>
> Key: HBASE-19204
> URL: https://issues.apache.org/jira/browse/HBASE-19204
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 19024.branch-1.2.004.patch, 
> HBASE-19024.branch-1.2.002.patch, HBASE-19024.branch-1.2.002.patch, 
> HBASE-19024.branch-1.2.003.patch, HBASE-19204.branch-1.2.001.patch, 
> HBASE-19204.branch-1.2.002.patch, HBASE-19204.branch-1.2.003.patch, 
> HBASE-19204.branch-1.2.004.patch, HBASE-19204.branch-1.2.005.patch, 
> HBASE-19204.branch-1.2.005.patch, HBASE-19204.branch-1.2.005.patch, 
> HBASE-19204.branch-1.2.005.patch, HBASE-19204.branch-1.2.006.patch, 
> HBASE-19204.branch-1.2.007.patch
>
>
> Sean has been looking at tooling and infra. This umbrella is about looking 
> at actual tests. For example, running locally on a dedicated machine, I 
> picked a random test, TestPerColumnFamilyFlush. In my test run, it wrote 16M 
> lines. It seems to be having zk issues, but it is catching interrupts and 
> ignoring them 
> ([~carp84] fixed this in later versions over in HBASE-18441).
> Let me try and do some fixup under this umbrella so we can get a 1.2.7 out 
> the door.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19112) Suspect methods on Cell to be deprecated

2017-11-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268113#comment-16268113
 ] 

stack commented on HBASE-19112:
---

[~chia7712] But we do not want to expose Tags on the client/user side in 
hbase2; they are for server-side only.

> Suspect methods on Cell to be deprecated
> 
>
> Key: HBASE-19112
> URL: https://issues.apache.org/jira/browse/HBASE-19112
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19112_branch-2.patch
>
>
> [~chia7712] suggested on the [mailing 
> list|https://lists.apache.org/thread.html/e6de9af26d9b888a358ba48bf74655ccd893573087c032c0fcf01585@%3Cdev.hbase.apache.org%3E]
>  that we have some methods on Cell which should be deprecated for removal:
> * {{#getType()}}
> * {{#getTimestamp()}}
> * {{#getTag()}}
> * {{#getSequenceId()}}
> Let's make a pass over these (and maybe the rest) to make sure that there 
> aren't others which are either implementation details or methods returning 
> now-private-marked classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18601) Update Htrace to 4.2

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268086#comment-16268086
 ] 

Hudson commented on HBASE-18601:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4129 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4129/])
HBASE-18601 Update Htrace to 4.2 (addendum) (chia7712: rev 
7c1c370f2fcde83b702011046a8b6fb0b01a263e)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/trace/TestHTraceHooks.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/trace/TraceTree.java


> Update Htrace to 4.2
> 
>
> Key: HBASE-18601
> URL: https://issues.apache.org/jira/browse/HBASE-18601
> Project: HBase
>  Issue Type: Improvement
>  Components: dependencies, tracing
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Tamas Penzes
>Assignee: Balazs Meszaros
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18601.master.001.patch, 
> HBASE-18601.master.002.patch, HBASE-18601.master.003 (3).patch, 
> HBASE-18601.master.003.patch, HBASE-18601.master.004.patch, 
> HBASE-18601.master.004.patch, HBASE-18601.master.005.patch, 
> HBASE-18601.master.006.patch, HBASE-18601.master.006.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.007.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.008.patch, 
> HBASE-18601.master.009.patch, HBASE-18601.master.009.patch, 
> HBASE-18601.master.010.patch, HBASE-18601.master.010.patch, 
> HBASE-18601.master.011.patch, HBASE-18601.master.012.patch, 
> HBASE-18601.master.013.patch, HBASE-18601.master.014.patch, 
> HBASE-18601.master.014.patch, HBASE-18601.master.015.patch, 
> HBASE-18601.master.016.patch
>
>
> HTrace is not perfectly integrated into HBase, version 3.2.0 is buggy, and 
> the upgrade to 4.x is not trivial and would take time. It might not be worth 
> keeping it in this state, so it would be better to remove it.
> Of course this doesn't mean tracing would be useless, just that in this form 
> the use of HTrace 3.2 might not add any value to the project, and fixing it 
> would be far too much effort.
> -
> Based on the decision of the community we keep htrace now and update version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19335) Fix waitUntilAllRegionsAssigned

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268087#comment-16268087
 ] 

Hudson commented on HBASE-19335:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4129 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4129/])
HBASE-19335 Fix waitUntilAllRegionsAssigned(). Ignore assignments to (appy: rev 
e70b628544b377d4107d13c9fdbe95540f4fd9d7)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseCluster.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
* (edit) 
hbase-it/src/test/java/org/apache/hadoop/hbase/DistributedHBaseCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java


> Fix waitUntilAllRegionsAssigned
> ---
>
> Key: HBASE-19335
> URL: https://issues.apache.org/jira/browse/HBASE-19335
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>  Labels: test
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19335.master.001.patch, 
> HBASE-19335.master.002.patch
>
>
> Found when debugging flaky test TestRegionObserverInterface#testRecovery.
> In the end, the test does the following:
> - Kills the RS
> - Waits for all regions to be assigned
> - Some validation (unrelated)
> - Cleanup: delete table.
> {noformat}
>   cluster.killRegionServer(rs1.getRegionServer().getServerName());
>   Threads.sleep(1000); // Let the kill soak in.
>   util.waitUntilAllRegionsAssigned(tableName);
>   LOG.info("All regions assigned");
>   verifyMethodResult(SimpleRegionObserver.class,
> new String[] { "getCtPreReplayWALs", "getCtPostReplayWALs", 
> "getCtPreWALRestore",
> "getCtPostWALRestore", "getCtPrePut", "getCtPostPut" },
> tableName, new Integer[] { 1, 1, 2, 2, 0, 0 });
> } finally {
>   util.deleteTable(tableName);
>   table.close();
> }
>   }
> {noformat}
> However, looking at test logs, we found that we had overlapping Assigns and 
> Unassigns. As a result, regions ended up 'stuck in RIT' and the test timed out.
> Assigns were from the ServerCrashRecovery and Unassigns were from the 
> deleteTable cleanup.
> Which begs the question: why did HBTU.waitUntilAllRegionsAssigned(tableName) 
> not wait until recovery was complete?
> Answer: Looks like that function is only meant for sunny scenarios but not 
> for crashes. It iterates over meta and just [checks for *some value* in the 
> server 
> column|https://github.com/apache/hbase/blob/cdc2bb17ff38dcbd273cf501aea565006e995a06/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java#L3421]
>  which is obviously present and equal to the server that was just killed.
> This bug must be affecting other fault-tolerance tests too, and fixing it 
> may fix more than just one test, hopefully.
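
A minimal sketch of the stricter check the fix implies, as a plain-Java 
stand-in (helper shape and names are assumptions, not the actual 
HBaseTestingUtility code):
{code:title=AssignmentCheckSketch.java}
import java.util.Set;

final class AssignmentCheckSketch {
  // Waiting on *some value* in the info:server column is not enough after a
  // kill: the column may still name the dead server. Also require the value
  // to match a currently live server.
  static boolean assignedToLiveServer(String serverColumnValue, Set<String> liveServers) {
    if (serverColumnValue == null) {
      return false;                                  // unassigned: keep waiting
    }
    return liveServers.contains(serverColumnValue);  // stale value: keep waiting
  }
}
{code}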



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19318) MasterRpcServices#getSecurityCapabilities explicitly checks for the HBase AccessController implementation

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268085#comment-16268085
 ] 

Hudson commented on HBASE-19318:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4129 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4129/])
HBASE-19318 Use the PB service interface as the judge of whether some (elserj: 
rev 5c1acf4792a7f7b6a5ace11c2fa4d172ede46b4e)
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterCoprocessorServices.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java


> MasterRpcServices#getSecurityCapabilities explicitly checks for the HBase 
> AccessController implementation
> -
>
> Key: HBASE-19318
> URL: https://issues.apache.org/jira/browse/HBASE-19318
> Project: HBase
>  Issue Type: Bug
>  Components: master, security
>Reporter: Sharmadha Sainath
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19318.001.branch-2.patch, 
> HBASE-19318.002.branch-2.patch
>
>
> Sharmadha brought a failure to my attention trying to use Ranger with HBase 
> 2.0 where the {{grant}} command was erroring out unexpectedly. The cluster 
> had the Ranger-specific coprocessors deployed, per what was previously 
> working on the HBase 1.1 line.
> After some digging, I found that the Master is actually making a check 
> explicitly for a Coprocessor that has the name 
> {{org.apache.hadoop.hbase.security.access.AccessController}} (short name or 
> full name), instead of looking for a deployed coprocessor which can be 
> assigned to {{AccessController}} (which is what Ranger does). We have the 
> CoprocessorHost methods to do the latter already implemented; it strikes me 
> that we just accidentally used the wrong method in MasterRpcServices.
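
A plain-Java sketch of the distinction, with all types as hypothetical 
stand-ins rather than HBase classes:
{code:title=CoprocessorCheckSketch.java}
import java.util.Arrays;
import java.util.List;

final class CoprocessorCheckSketch {
  interface Coprocessor {}
  interface AccessControlService {}  // stands in for the capability interface

  static final class HBaseAccessController implements Coprocessor, AccessControlService {}
  static final class RangerAuthorizer implements Coprocessor, AccessControlService {}

  // Too strict: matches only HBase's own implementation by name.
  static boolean byName(List<Coprocessor> loaded) {
    return loaded.stream()
        .anyMatch(c -> c.getClass().getSimpleName().equals("HBaseAccessController"));
  }

  // What the fix amounts to conceptually: accept any coprocessor that can be
  // assigned to the capability, which is how Ranger's coprocessor qualifies.
  static boolean byCapability(List<Coprocessor> loaded) {
    return loaded.stream().anyMatch(c -> c instanceof AccessControlService);
  }

  public static void main(String[] args) {
    List<Coprocessor> loaded = Arrays.<Coprocessor>asList(new RangerAuthorizer());
    System.out.println(byName(loaded));        // false: grant would error out
    System.out.println(byCapability(loaded));  // true: capability is present
  }
}
{code}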



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log

2017-11-27 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268045#comment-16268045
 ] 

binlijin commented on HBASE-19290:
--

bq. Can you please walk me through a full case explaining need of that "if" 
condition and why value of grabbedTask=0 is special.
The task loop has the following stages:
(1) getTaskList: get the tasks from the zookeeper splitLogZNode node (issues a 
zookeeper request).
(2) a for loop trying to grab each of those tasks (issues a zookeeper request 
per grab attempt).
(3) a while loop: sleep while seq_start == taskReadySeq.get(), else skip.
When grabbedTask=0 and we skip the stage (3) while loop, the regionserver will 
try to grab tasks again, may again get none, and puts pressure on zookeeper, 
so we throttle it. When grabbedTask=1 and we skip the stage (3) while loop, 
the regionserver will try to grab tasks again; if it gets no tasks this round 
it will throttle, and if it gets a task it may try to grab tasks next round 
again.
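
A minimal sketch of those stages, with illustrative names rather than the 
actual SplitLogWorker code:
{code:title=TaskLoopSketch.java}
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

abstract class TaskLoopSketch {
  // Bumped by the ZK watcher when new tasks appear under splitLogZNode.
  final AtomicInteger taskReadySeq = new AtomicInteger();
  final Object taskReadyLock = new Object();

  abstract List<String> getTaskList();        // stage (1): one ZK read
  abstract boolean tryGrabTask(String path);  // stage (2): ZK request per task

  void taskLoop() throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      int seqStart = taskReadySeq.get();
      int grabbedTask = 0;
      for (String task : getTaskList()) {
        if (tryGrabTask(task)) {
          grabbedTask++;
        }
      }
      if (grabbedTask == 0) {
        // Stage (3): grabbed nothing, so wait for the ready-sequence to move
        // instead of immediately re-polling ZooKeeper (the throttle).
        synchronized (taskReadyLock) {
          while (seqStart == taskReadySeq.get()) {
            taskReadyLock.wait(1000);
          }
        }
      }
      // grabbedTask >= 1: loop again right away; the next empty round throttles.
    }
  }
}
{code}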


> Reduce zk request when doing split log
> --
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch, 
> HBASE-19290.master.004.patch
>
>
> We observe that once the cluster has 1000+ nodes, when hundreds of nodes 
> abort and do split log, the split is very, very slow, and we find the 
> regionserver and master waiting on the zookeeper response, so we need to 
> reduce zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time, calculateAvailableSplitters gets 
> rsZNode's children from zookeeper; when the cluster is huge, this is heavy. 
> This patch reduces the requests. 
> (2) When the regionserver has the max split tasks running, it may still try 
> to grab tasks and issue zookeeper requests; we should sleep and wait until 
> we can grab tasks again.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19346) Use EventLoopGroup to create AsyncFSOutput

2017-11-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268027#comment-16268027
 ] 

Duo Zhang commented on HBASE-19346:
---

OK, the problem is the transparent crypto. Let me take a look.

> Use EventLoopGroup to create AsyncFSOutput
> --
>
> Key: HBASE-19346
> URL: https://issues.apache.org/jira/browse/HBASE-19346
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19346-v1.patch, HBASE-19346-v2.patch, 
> HBASE-19346.patch
>
>
> So that we can use different event loops to manage the connections to 
> different datanodes. And since EventLoop itself is also an EventLoopGroup, 
> we could still use the event loop to create AsyncFSOutput, so the logic of 
> AsyncFSWAL will not be broken.
> Will open a new issue to modify AsyncFSWAL so that finally we can use 
> multiple event loops.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19318) MasterRpcServices#getSecurityCapabilities explicitly checks for the HBase AccessController implementation

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268024#comment-16268024
 ] 

Hudson commented on HBASE-19318:


FAILURE: Integrated in Jenkins build HBase-2.0 #927 (See 
[https://builds.apache.org/job/HBase-2.0/927/])
HBASE-19318 Use the PB service interface as the judge of whether some (elserj: 
rev e42d20f8ddd4c27d6138e71ec78d0fbfe59790a4)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterCoprocessorServices.java


> MasterRpcServices#getSecurityCapabilities explicitly checks for the HBase 
> AccessController implementation
> -
>
> Key: HBASE-19318
> URL: https://issues.apache.org/jira/browse/HBASE-19318
> Project: HBase
>  Issue Type: Bug
>  Components: master, security
>Reporter: Sharmadha Sainath
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19318.001.branch-2.patch, 
> HBASE-19318.002.branch-2.patch
>
>
> Sharmadha brought a failure to my attention trying to use Ranger with HBase 
> 2.0 where the {{grant}} command was erroring out unexpectedly. The cluster 
> had the Ranger-specific coprocessors deployed, per what was previously 
> working on the HBase 1.1 line.
> After some digging, I found that the Master is actually making a check 
> explicitly for a Coprocessor that has the name 
> {{org.apache.hadoop.hbase.security.access.AccessController}} (short name or 
> full name), instead of looking for a deployed coprocessor which can be 
> assigned to {{AccessController}} (which is what Ranger does). We have the 
> CoprocessorHost methods to do the latter already implemented; it strikes me 
> that we just accidentally used the wrong method in MasterRpcServices.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19335) Fix waitUntilAllRegionsAssigned

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268026#comment-16268026
 ] 

Hudson commented on HBASE-19335:


FAILURE: Integrated in Jenkins build HBase-2.0 #927 (See 
[https://builds.apache.org/job/HBase-2.0/927/])
HBASE-19335 Fix waitUntilAllRegionsAssigned(). Ignore assignments to (appy: rev 
03845614238de2c7ee040b709a26562e19918ff6)
* (edit) 
hbase-it/src/test/java/org/apache/hadoop/hbase/DistributedHBaseCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseCluster.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java


> Fix waitUntilAllRegionsAssigned
> ---
>
> Key: HBASE-19335
> URL: https://issues.apache.org/jira/browse/HBASE-19335
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>  Labels: test
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19335.master.001.patch, 
> HBASE-19335.master.002.patch
>
>
> Found when debugging flaky test TestRegionObserverInterface#testRecovery.
> In the end, the test does the following:
> - Kills the RS
> - Waits for all regions to be assigned
> - Some validation (unrelated)
> - Cleanup: delete table.
> {noformat}
>   cluster.killRegionServer(rs1.getRegionServer().getServerName());
>   Threads.sleep(1000); // Let the kill soak in.
>   util.waitUntilAllRegionsAssigned(tableName);
>   LOG.info("All regions assigned");
>   verifyMethodResult(SimpleRegionObserver.class,
> new String[] { "getCtPreReplayWALs", "getCtPostReplayWALs", 
> "getCtPreWALRestore",
> "getCtPostWALRestore", "getCtPrePut", "getCtPostPut" },
> tableName, new Integer[] { 1, 1, 2, 2, 0, 0 });
> } finally {
>   util.deleteTable(tableName);
>   table.close();
> }
>   }
> {noformat}
> However, looking at test logs, we found that we had overlapping Assigns and 
> Unassigns. As a result, regions ended up 'stuck in RIT' and the test timed out.
> Assigns were from the ServerCrashRecovery and Unassigns were from the 
> deleteTable cleanup.
> Which begs the question: why did HBTU.waitUntilAllRegionsAssigned(tableName) 
> not wait until recovery was complete?
> Answer: Looks like that function is only meant for sunny scenarios but not 
> for crashes. It iterates over meta and just [checks for *some value* in the 
> server 
> column|https://github.com/apache/hbase/blob/cdc2bb17ff38dcbd273cf501aea565006e995a06/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java#L3421]
>  which is obviously present and equal to the server that was just killed.
> This bug must be affecting other fault-tolerance tests too, and fixing it 
> may fix more than just one test, hopefully.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18601) Update Htrace to 4.2

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268025#comment-16268025
 ] 

Hudson commented on HBASE-18601:


FAILURE: Integrated in Jenkins build HBase-2.0 #927 (See 
[https://builds.apache.org/job/HBase-2.0/927/])
HBASE-18601 Update Htrace to 4.2 (addendum) (chia7712: rev 
95e4f059a3b5937d47deda49f40be87af5b12c53)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/trace/TestHTraceHooks.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/trace/TraceTree.java


> Update Htrace to 4.2
> 
>
> Key: HBASE-18601
> URL: https://issues.apache.org/jira/browse/HBASE-18601
> Project: HBase
>  Issue Type: Improvement
>  Components: dependencies, tracing
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Tamas Penzes
>Assignee: Balazs Meszaros
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18601.master.001.patch, 
> HBASE-18601.master.002.patch, HBASE-18601.master.003 (3).patch, 
> HBASE-18601.master.003.patch, HBASE-18601.master.004.patch, 
> HBASE-18601.master.004.patch, HBASE-18601.master.005.patch, 
> HBASE-18601.master.006.patch, HBASE-18601.master.006.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.007.patch, 
> HBASE-18601.master.007.patch, HBASE-18601.master.008.patch, 
> HBASE-18601.master.009.patch, HBASE-18601.master.009.patch, 
> HBASE-18601.master.010.patch, HBASE-18601.master.010.patch, 
> HBASE-18601.master.011.patch, HBASE-18601.master.012.patch, 
> HBASE-18601.master.013.patch, HBASE-18601.master.014.patch, 
> HBASE-18601.master.014.patch, HBASE-18601.master.015.patch, 
> HBASE-18601.master.016.patch
>
>
> HTrace is not perfectly integrated into HBase, version 3.2.0 is buggy, and 
> the upgrade to 4.x is not trivial and would take time. It might not be worth 
> keeping it in this state, so it would be better to remove it.
> Of course this doesn't mean tracing would be useless, just that in this form 
> the use of HTrace 3.2 might not add any value to the project, and fixing it 
> would be far too much effort.
> -
> Based on the decision of the community we keep htrace now and update version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17049) Do not issue sync request when there are still entries in ringbuffer

2017-11-27 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268012#comment-16268012
 ] 

Guanghao Zhang commented on HBASE-17049:


bq. There is performance regression in CompletableFuture before jdk8u131
Yes, this may be a problem. Let me rerun the pe test with the latest jdk8u152.

> Do not issue sync request when there are still entries in ringbuffer
> 
>
> Key: HBASE-17049
> URL: https://issues.apache.org/jira/browse/HBASE-17049
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-17049.patch, delay-sync.patch
>
>
> https://issues.apache.org/jira/browse/HBASE-16890?focusedCommentId=15647590=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15647590



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19096) Add RowMutions batch support in AsyncTable

2017-11-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267997#comment-16267997
 ] 

Jerry He commented on HBASE-19096:
--

v3 fixes checkstyle warnings.

> Add RowMutions batch support in AsyncTable
> --
>
> Key: HBASE-19096
> URL: https://issues.apache.org/jira/browse/HBASE-19096
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 2.0.0
>
> Attachments: HBASE-19096-master-v2.patch, 
> HBASE-19096-master-v3.patch, HBASE-19096-master.patch
>
>
> Batch support for RowMutations has been added in the Table interface, but is 
> not in AsyncTable. This JIRA will add it.
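
A hedged sketch of the intended call shape; batchAll and its generics here 
are assumed from the Table counterpart rather than taken from the final patch:
{code:title=AsyncRowMutationsSketch.java}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.hbase.client.AsyncTable;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RowMutations;
import org.apache.hadoop.hbase.util.Bytes;

class AsyncRowMutationsSketch {
  // Sketch of batching a RowMutations through AsyncTable once the support
  // lands; not the final API of this JIRA.
  static CompletableFuture<List<Object>> mutateRowAtomically(AsyncTable table) throws Exception {
    byte[] row = Bytes.toBytes("row1");
    byte[] cf = Bytes.toBytes("cf");
    RowMutations rm = new RowMutations(row);
    rm.add(new Put(row).addColumn(cf, Bytes.toBytes("q1"), Bytes.toBytes("v1")));
    rm.add(new Delete(row).addColumns(cf, Bytes.toBytes("q2")));
    // The Put and Delete to 'row1' apply atomically as one row mutation.
    return table.batchAll(Collections.singletonList(rm));
  }
}
{code}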



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19342) fix TestTableBasedReplicationSourceManagerImpl#testRemovePeerMetricsCleanup

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267996#comment-16267996
 ] 

Hadoop QA commented on HBASE-19342:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
10s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} hbase-server: The patch generated 0 new + 26 
unchanged - 2 fixed = 26 total (was 28) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
22s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
48m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 
15s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19342 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899331/HBASE-19342.v0.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 216e99acfb32 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 7c1c370f2f |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/10055/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/10055/console |
| Powered by | Apache Yetus 0.6.0   http://yetus.apache.org |


This message was automatically generated.



> fix TestTableBasedReplicationSourceManagerImpl#testRemovePeerMetricsCleanup
> 

[jira] [Updated] (HBASE-19096) Add RowMutations batch support in AsyncTable

2017-11-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-19096:
-
Attachment: HBASE-19096-master-v3.patch

> Add RowMutations batch support in AsyncTable
> --
>
> Key: HBASE-19096
> URL: https://issues.apache.org/jira/browse/HBASE-19096
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 2.0.0
>
> Attachments: HBASE-19096-master-v2.patch, 
> HBASE-19096-master-v3.patch, HBASE-19096-master.patch
>
>
> Batch support for RowMutations has been added in the Table interface, but is 
> not in AsyncTable. This JIRA will add it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17049) Do not issue sync request when there are still entries in ringbuffer

2017-11-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267989#comment-16267989
 ] 

Duo Zhang commented on HBASE-17049:
---

Oh, one more thing, [~chancelq] and [~ram_krish]: which JDK version do you 
use? There is a performance regression in CompletableFuture before jdk8u131.

Thanks.

> Do not issue sync request when there are still entries in ringbuffer
> 
>
> Key: HBASE-17049
> URL: https://issues.apache.org/jira/browse/HBASE-17049
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-17049.patch, delay-sync.patch
>
>
> https://issues.apache.org/jira/browse/HBASE-16890?focusedCommentId=15647590=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15647590



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19325) Pass a list of server name to postClearDeadServers

2017-11-27 Thread Guangxu Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267974#comment-16267974
 ] 

Guangxu Cheng commented on HBASE-19325:
---

Thanks all for reviewing.

> Pass a list of server name to postClearDeadServers
> --
>
> Key: HBASE-19325
> URL: https://issues.apache.org/jira/browse/HBASE-19325
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-2
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Fix For: 1.4.0, 2.0.0-beta-1
>
> Attachments: HBASE-19325.branch-1.001.patch, 
> HBASE-19325.branch-1.001.patch, HBASE-19325.branch-2.001.patch
>
>
> Over on the tail of HBASE-18131. [~chia7712] said 
> {quote}
> (Revisiting the AccessController remind me of this issue) 
> Could we remove the duplicate code on the server side? Why not pass a list of 
> server name to postClearDeadServers and postListDeadServers?
> {quote}
> The duplicate code has been removed in HBASE-19131. Now we pass a list of 
> server names to postClearDeadServers.
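
For coprocessor authors, the reworked hook would then look roughly like the 
sketch below (parameter names are assumptions based on this discussion, not a 
quote of the committed signature):

{code}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.coprocessor.MasterCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.MasterObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;

public class AuditObserver implements MasterObserver {
  @Override
  public void postClearDeadServers(ObserverContext<MasterCoprocessorEnvironment> ctx,
      List<ServerName> clearedServers, List<ServerName> notClearedServers)
      throws IOException {
    // the hook now receives the server names directly, so the observer needs
    // no duplicate bookkeeping to recover them
    System.out.println("cleared dead servers: " + clearedServers);
  }
}
{code}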



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19346) Use EventLoopGroup to create AsyncFSOutput

2017-11-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267970#comment-16267970
 ] 

Duo Zhang commented on HBASE-19346:
---

{quote}
java.lang.IllegalStateException: should call flush first before calling close
at 
org.apache.hadoop.hbase.io.asyncfs.TestSaslFanOutOneBlockAsyncDFSOutput.test(TestSaslFanOutOneBlockAsyncDFSOutput.java:252)
{quote}

After checking the code carefully, I think this error is possible. In 
FanOutOneBlockAsyncDFSOutput.completed, we first try completing the future and 
then remove it from the waitingAckQueue. For this UT, the test itself does not 
run inside the event loop, so it is possible that we call close before we 
remove the completed entry from waitingAckQueue. Let me fix it.

{quote}
org.apache.hadoop.fs.ChecksumException: Checksum error: 
/test_1__protection_authentication__encryption___cipherSuite___transparent_enc_true_/test
 at 0 exp: 1198447100 got: -321676678
at 
org.apache.hadoop.hbase.io.asyncfs.TestSaslFanOutOneBlockAsyncDFSOutput.test(TestSaslFanOutOneBlockAsyncDFSOutput.java:252)
{quote}

For this one, it seems to be thrown when reading the file. This is a bit 
strange, as the block has already been committed, which means the DN has 
already confirmed that the checksums are all correct; otherwise the write 
would have failed.

Anyway, let me fix the former one and try again.
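
In other words, the fix is an ordering change along these lines (a minimal 
sketch with hypothetical names, not the actual patch):

{code}
// waitingAckQueue: a concurrent queue of per-flush callbacks (Callback,
// isAckedByAllDatanodes, ackedLength are hypothetical names)
for (Iterator<Callback> iter = waitingAckQueue.iterator(); iter.hasNext();) {
  Callback cb = iter.next();
  if (cb.isAckedByAllDatanodes()) {
    iter.remove();  // dequeue first...
    // ...then complete, so a close() racing from outside the event loop can
    // no longer observe an entry that is completed but still queued
    cb.future().complete(cb.ackedLength());
  }
}
{code}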

> Use EventLoopGroup to create AsyncFSOutput
> --
>
> Key: HBASE-19346
> URL: https://issues.apache.org/jira/browse/HBASE-19346
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19346-v1.patch, HBASE-19346-v2.patch, 
> HBASE-19346.patch
>
>
> So that we can use different event loops to manage the connections to 
> different datanodes. And since EventLoop itself is also an EventLoopGroup, we 
> could still use the event loop to create AsyncFSOutput, so the logic of 
> AsyncFSWAL will not be broken.
> Will open a new issue to modify AsyncFSWAL, and finally we can use multiple 
> event loops.
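
The enabling fact here is that in Netty an EventLoop is itself an 
EventLoopGroup; a small self-contained illustration with plain Netty types 
(illustrative only, not taken from the patch):

{code}
import io.netty.channel.EventLoop;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;

public class LoopDemo {
  public static void main(String[] args) {
    EventLoopGroup group = new NioEventLoopGroup(4);
    // EventLoop extends EventLoopGroup, so a single loop can be passed
    // wherever a group is expected -- existing AsyncFSWAL logic keeps working
    EventLoop single = group.next();
    EventLoopGroup viewedAsGroup = single;
    System.out.println(viewedAsGroup.next() == single);  // true: a group of one
    group.shutdownGracefully();
  }
}
{code}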



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19338) Performance regression in RegionServerRpcQuotaManager to get ugi

2017-11-27 Thread binlijin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

binlijin updated HBASE-19338:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Performance regression in RegionServerRpcQuotaManager to get ugi 
> -
>
> Key: HBASE-19338
> URL: https://issues.apache.org/jira/browse/HBASE-19338
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.0.0-beta-2
>Reporter: binlijin
>Assignee: binlijin
>Priority: Critical
> Attachments: 19338.master.003.patch, 19338.master.003.patch, 
> HBASE-19338.master.001.patch, HBASE-19338.master.002.patch
>
>
> We find that hbase-2.0.0-beta-1.SNAPSHOT has a performance regression with 
> YCSB put, and we have some findings.
> {code}
> "RpcServer.default.FPBQ.Fifo.handler=131,queue=17,port=16020" #245 daemon 
> prio=5 os_prio=0 tid=0x7fc82b22e000 nid=0x3a5db waiting for monitor entry 
> [0x7fc50fafa000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
> - waiting to lock <0x7fcaedc20830> (a java.lang.Class for 
> org.apache.hadoop.security.UserGroupInformation)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:264)
> at org.apache.hadoop.hbase.security.User.getCurrent(User.java:162)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:179)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:162)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2521)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:325)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:305)
> {code}
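
The general remedy, sketched below with illustrative call sites (not 
necessarily the committed patch), is to reuse the request user that the RPC 
layer has already resolved instead of funnelling every handler through 
User.getCurrent(), which blocks on a synchronized static in 
UserGroupInformation:

{code}
// sketch only: checkQuota parameters are illustrative, and the fallback path
// still pays the synchronized UGI cost; User.getCurrent() may throw IOException
Optional<User> requestUser = RpcServer.getRequestUser();
User user = requestUser.isPresent() ? requestUser.get() : User.getCurrent();
quotaManager.checkQuota(user, region, numWrites, numReads, numScans);
{code}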



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19338) Performance regression in RegionServerRpcQuotaManager to get ugi

2017-11-27 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267966#comment-16267966
 ] 

binlijin commented on HBASE-19338:
--

Pushed to master and branch-2.

> Performance regression in RegionServerRpcQuotaManager to get ugi 
> -
>
> Key: HBASE-19338
> URL: https://issues.apache.org/jira/browse/HBASE-19338
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.0.0-beta-2
>Reporter: binlijin
>Assignee: binlijin
>Priority: Critical
> Attachments: 19338.master.003.patch, 19338.master.003.patch, 
> HBASE-19338.master.001.patch, HBASE-19338.master.002.patch
>
>
> We find that hbase-2.0.0-beta-1.SNAPSHOT has a performance regression with 
> YCSB put, and we have some findings.
> {code}
> "RpcServer.default.FPBQ.Fifo.handler=131,queue=17,port=16020" #245 daemon 
> prio=5 os_prio=0 tid=0x7fc82b22e000 nid=0x3a5db waiting for monitor entry 
> [0x7fc50fafa000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
> - waiting to lock <0x7fcaedc20830> (a java.lang.Class for 
> org.apache.hadoop.security.UserGroupInformation)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:264)
> at org.apache.hadoop.hbase.security.User.getCurrent(User.java:162)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:179)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:162)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2521)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:325)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:305)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19356) Provide delegators and base implementation for Phoenix implemented interfaces

2017-11-27 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267951#comment-16267951
 ] 

Zach York commented on HBASE-19356:
---

I'm a fan of default implementations for this, like what [~appy] did for the 
coprocessor interfaces in HBASE-17312.
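
For reference, the default-method approach means the interface carries no-op 
defaults so implementors override only the hooks they need; a minimal toy of 
the idea (Observer here is a stand-in, not the real coprocessor interface):

{code}
interface Observer {
  // every hook gets a no-op default
  default void preOperation(String op) { }
  default void postOperation(String op) { }
}

class LoggingObserver implements Observer {
  @Override
  public void postOperation(String op) {
    System.out.println("completed: " + op);  // the only hook we override
  }
}
{code}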

> Provide delegators and base implementation for Phoenix implemented interfaces
> -
>
> Key: HBASE-19356
> URL: https://issues.apache.org/jira/browse/HBASE-19356
> Project: HBase
>  Issue Type: Improvement
>Reporter: James Taylor
>
> Many of the changes Phoenix needs to make for various branches to support 
> different versions of HBase are due to new methods being added to interfaces. 
> Oftentimes Phoenix can use a no-op or simply needs to add the new method to 
> its delegate implementor. It'd be helpful if HBase provided base 
> implementations and delegates that Phoenix could use instead. Here are some 
> that come to mind:
> - RegionScanner
> - HTableInterface (Table interface now?)
> - RegionObserver
> There are likely others that [~rajeshbabu], [~an...@apache.org], and 
> [~elserj] would remember.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19356) Provide delegators and base implementation for Phoenix implemented interfaces

2017-11-27 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267944#comment-16267944
 ] 

James Taylor commented on HBASE-19356:
--

FYI, [~lhofhansl].

> Provide delegators and base implementation for Phoenix implemented interfaces
> -
>
> Key: HBASE-19356
> URL: https://issues.apache.org/jira/browse/HBASE-19356
> Project: HBase
>  Issue Type: Improvement
>Reporter: James Taylor
>
> Many of the changes Phoenix needs to make for various branches to support 
> different versions of HBase are due to new methods being added to interfaces. 
> Oftentimes Phoenix can use a no-op or simply needs to add the new method to 
> its delegate implementor. It'd be helpful if HBase provided base 
> implementations and delegates that Phoenix could use instead. Here are some 
> that come to mind:
> - RegionScanner
> - HTableInterface (Table interface now?)
> - RegionObserver
> There are likely others that [~rajeshbabu], [~an...@apache.org], and 
> [~elserj] would remember.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19356) Provide delegators and base implementation for Phoenix implemented interfaces

2017-11-27 Thread James Taylor (JIRA)
James Taylor created HBASE-19356:


 Summary: Provide delegators and base implementation for Phoenix 
implemented interfaces
 Key: HBASE-19356
 URL: https://issues.apache.org/jira/browse/HBASE-19356
 Project: HBase
  Issue Type: Improvement
Reporter: James Taylor


Many of the changes Phoenix needs to make for various branches to support 
different versions of HBase are due to new methods being added to interfaces. 
Oftentimes Phoenix can use a no-op or simply needs to add the new method to 
its delegate implementor. It'd be helpful if HBase provided base 
implementations and delegates that Phoenix could use instead. Here are some 
that come to mind:
- RegionScanner
- HTableInterface (Table interface now?)
- RegionObserver

There are likely others that [~rajeshbabu], [~an...@apache.org], and [~elserj] 
would remember.
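
A delegator in this spirit forwards every call to a wrapped instance, so 
downstream projects override only the methods they care about. A 
self-contained toy of the pattern (Scanner is a stand-in, not the real 
RegionScanner):

{code}
import java.util.List;

interface Scanner {
  boolean next(List<String> out);
  void close();
}

// base delegate: forwards everything; subclasses override only what they change
class DelegatingScanner implements Scanner {
  private final Scanner delegate;

  DelegatingScanner(Scanner delegate) {
    this.delegate = delegate;
  }

  @Override
  public boolean next(List<String> out) {
    return delegate.next(out);
  }

  @Override
  public void close() {
    delegate.close();
  }
}
{code}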



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19355) Missing dependency on hbase-zookeeper module causes CopyTable to fail

2017-11-27 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19355:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-1
   Status: Resolved  (was: Patch Available)

Thanks for the review, Appy.

> Missing dependency on hbase-zookeeper module causes CopyTable to fail
> -
>
> Key: HBASE-19355
> URL: https://issues.apache.org/jira/browse/HBASE-19355
> Project: HBase
>  Issue Type: Bug
>Reporter: Romil Choksi
>Assignee: Ted Yu
> Fix For: 2.0.0-beta-1
>
> Attachments: 19355.v1.txt
>
>
> Romil reported seeing the following error when running CopyTable:
> {code}
> 2017-11-27 23:14:38,003 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1511805117287_0023_m_00_1, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.zookeeper.ZKWatcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at org.apache.hadoop.hbase.mapreduce.Import$Importer.setup(Import.java:614)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
> This was due to a missing dependency on the hbase-zookeeper module.
> Once the dependency is added through ZKWatcher.class, the CopyTable job can 
> succeed.
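
Concretely, "added through ZKWatcher.class" boils down to shipping that 
class's containing jar with the job, in the spirit of the usual mapreduce 
helper (a sketch; the committed change wires the dependency in the module's 
pom):

{code}
// ensure the jar containing ZKWatcher rides along with the MR job
TableMapReduceUtil.addDependencyJars(job.getConfiguration(),
    org.apache.hadoop.hbase.zookeeper.ZKWatcher.class);
{code}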



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19355) Missing dependency on hbase-zookeeper module causes CopyTable to fail

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267935#comment-16267935
 ] 

Hadoop QA commented on HBASE-19355:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.6.0/precommit-patchnames for 
instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
34s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
28s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
54m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
27s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19355 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899511/19355.v1.txt |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  findbugs  hbaseanti  checkstyle  |
| uname | Linux 985c9a69ea92 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 7c1c370f2f |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 

[jira] [Commented] (HBASE-19346) Use EventLoopGroup to create AsyncFSOutput

2017-11-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267931#comment-16267931
 ] 

Duo Zhang commented on HBASE-19346:
---

Good. It seems the newly added pipeline flush test found some problems. Let me 
check.

> Use EventLoopGroup to create AsyncFSOutput
> --
>
> Key: HBASE-19346
> URL: https://issues.apache.org/jira/browse/HBASE-19346
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19346-v1.patch, HBASE-19346-v2.patch, 
> HBASE-19346.patch
>
>
> So that we can use different event loop to manage the connections to 
> different datanodes. And since EventLoop itself is also an EventLoopGroup, we 
> could still use the event loop to create AsyncFSOutput so the logic of 
> AsyncFSWAL will not be broken.
> Will open a new issue to modify AsyncFSWAL and finally we can use multiple 
> event loop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19338) Performance regression in RegionServerRpcQuotaManager to get ugi

2017-11-27 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267923#comment-16267923
 ] 

binlijin commented on HBASE-19338:
--

Will commit it shortly. Thanks all for the review.

> Performance regression in RegionServerRpcQuotaManager to get ugi 
> -
>
> Key: HBASE-19338
> URL: https://issues.apache.org/jira/browse/HBASE-19338
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.0.0-beta-2
>Reporter: binlijin
>Assignee: binlijin
>Priority: Critical
> Attachments: 19338.master.003.patch, 19338.master.003.patch, 
> HBASE-19338.master.001.patch, HBASE-19338.master.002.patch
>
>
> We find that hbase-2.0.0-beta-1.SNAPSHOT has a performance regression with 
> YCSB put, and we have some findings.
> {code}
> "RpcServer.default.FPBQ.Fifo.handler=131,queue=17,port=16020" #245 daemon 
> prio=5 os_prio=0 tid=0x7fc82b22e000 nid=0x3a5db waiting for monitor entry 
> [0x7fc50fafa000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
> - waiting to lock <0x7fcaedc20830> (a java.lang.Class for 
> org.apache.hadoop.security.UserGroupInformation)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:264)
> at org.apache.hadoop.hbase.security.User.getCurrent(User.java:162)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:179)
> at 
> org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:162)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2521)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:325)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:305)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19335) Fix waitUntilAllRegionsAssigned

2017-11-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267916#comment-16267916
 ] 

Ted Yu commented on HBASE-19335:


Thanks for fixing this flaky test, Appy.

> Fix waitUntilAllRegionsAssigned
> ---
>
> Key: HBASE-19335
> URL: https://issues.apache.org/jira/browse/HBASE-19335
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>  Labels: test
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19335.master.001.patch, 
> HBASE-19335.master.002.patch
>
>
> Found when debugging flaky test TestRegionObserverInterface#testRecovery.
> In the end, the test does the following:
> - Kills the RS
> - Waits for all regions to be assigned
> - Some validation (unrelated)
> - Cleanup: delete table.
> {noformat}
>   cluster.killRegionServer(rs1.getRegionServer().getServerName());
>   Threads.sleep(1000); // Let the kill soak in.
>   util.waitUntilAllRegionsAssigned(tableName);
>   LOG.info("All regions assigned");
>   verifyMethodResult(SimpleRegionObserver.class,
> new String[] { "getCtPreReplayWALs", "getCtPostReplayWALs", 
> "getCtPreWALRestore",
> "getCtPostWALRestore", "getCtPrePut", "getCtPostPut" },
> tableName, new Integer[] { 1, 1, 2, 2, 0, 0 });
> } finally {
>   util.deleteTable(tableName);
>   table.close();
> }
>   }
> {noformat}
> However, looking at the test logs, we found that we had overlapping Assigns 
> with Unassigns. As a result, regions ended up 'stuck in RIT' and the test 
> timed out. Assigns were from the ServerCrashRecovery and Unassigns were from 
> the deleteTable cleanup.
> This raises the question: why did HBTU.waitUntilAllRegionsAssigned(tableName) 
> not wait until recovery was complete?
> Answer: it looks like that function is only meant for sunny-day scenarios, not 
> for crashes. It iterates over meta and just [checks for *some value* in the 
> server 
> column|https://github.com/apache/hbase/blob/cdc2bb17ff38dcbd273cf501aea565006e995a06/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java#L3421]
>  which is obviously present and equal to the server that was just killed.
> This bug must be affecting other fault tolerance tests too and fixing it may 
> fix more than just one test, hopefully.
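
A stricter wait would need to verify not just that the info:server column 
holds some value, but that the value refers to a region server the cluster 
still considers live. A hedged sketch of the per-row check (liveServers is a 
hypothetical set of live ServerName instances, and the helper names are 
illustrative):

{code}
// inside the visitor over hbase:meta rows (sketch; names hypothetical)
byte[] server = result.getValue(HConstants.CATALOG_FAMILY, HConstants.SERVER_QUALIFIER);
if (server == null) {
  return false;  // region not assigned yet
}
ServerName assigned = ServerName.parseServerName(Bytes.toString(server));
// reject values still pointing at the server we just killed
return liveServers.stream()
    .anyMatch(live -> live.getAddress().equals(assigned.getAddress()));
{code}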



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19325) Pass a list of server name to postClearDeadServers

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267906#comment-16267906
 ] 

Hudson commented on HBASE-19325:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4128 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4128/])
HBASE-19325 Pass a list of server name to postClearDeadServers (chia7712: rev 
5a0881a98b3575d900d483222e2fdfab15159656)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java


> Pass a list of server name to postClearDeadServers
> --
>
> Key: HBASE-19325
> URL: https://issues.apache.org/jira/browse/HBASE-19325
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-2
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Fix For: 1.4.0, 2.0.0-beta-1
>
> Attachments: HBASE-19325.branch-1.001.patch, 
> HBASE-19325.branch-1.001.patch, HBASE-19325.branch-2.001.patch
>
>
> Over on the tail of HBASE-18131. [~chia7712] said 
> {quote}
> (Revisiting the AccessController remind me of this issue) 
> Could we remove the duplicate code on the server side? Why not pass a list of 
> server name to postClearDeadServers and postListDeadServers?
> {quote}
> The duplicate code has been removed in HBASE-19131.Now Pass a list of server 
> name to postClearDeadServers



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267907#comment-16267907
 ] 

Hudson commented on HBASE-19319:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4128 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4128/])
HBASE-19319 Fix bug in synchronizing over ProcedureEvent (appy: rev 
f8867166178e76598393437baac3849f4943ff16)
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureSchedulerConcurrency.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/AbstractProcedureScheduler.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/locking/LockProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureScheduler.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureEvent.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureScheduler.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureEvents.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureEnv.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureEvents.java


> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19319.master.001.patch, 
> HBASE-19319.master.002.patch
>
>
> Following synchronizes over local variable rather than the original 
> ProcedureEvent object. Clearly a bug since this code block won't follow 
> exclusion with many of the synchronized methods in ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}
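
The direction of the fix is to let the event own its monitor, so state checks 
and wake-ups go through synchronized instance methods instead of ad-hoc 
synchronized blocks in the scheduler. A simplified, self-contained sketch of 
that shape (not the committed patch):

{code}
public class ProcedureEvent {
  private boolean ready;

  public synchronized boolean isReady() {
    return ready;
  }

  // park the caller's procedure only while holding the event's own monitor,
  // so a concurrent wake() cannot slip in between the check and the suspend
  public synchronized boolean suspendIfNotReady(Runnable suspend) {
    if (!ready) {
      suspend.run();
      return true;
    }
    return false;
  }

  public synchronized void wake() {
    ready = true;
  }
}
{code}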



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19204) branch-1.2 times out and is taking 6-7 hours to complete

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267890#comment-16267890
 ] 

Hadoop QA commented on HBASE-19204:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
7s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1.2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
35s{color} | {color:green} branch-1.2 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} branch-1.2 passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 
21s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
21s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} branch-1.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} branch-1.2 passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
46s{color} | {color:green} branch-1.2 passed with JDK v1.7.0_161 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed with JDK v1.8.0_152 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed with JDK v1.7.0_161 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  7m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  0s{color} | 
{color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
17s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
30m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| 

[jira] [Updated] (HBASE-19335) Fix waitUntilAllRegionsAssigned

2017-11-27 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19335:
-
Labels: test  (was: )

> Fix waitUntilAllRegionsAssigned
> ---
>
> Key: HBASE-19335
> URL: https://issues.apache.org/jira/browse/HBASE-19335
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>  Labels: test
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19335.master.001.patch, 
> HBASE-19335.master.002.patch
>
>
> Found when debugging flaky test TestRegionObserverInterface#testRecovery.
> In the end, the test does the following:
> - Kills the RS
> - Waits for all regions to be assigned
> - Some validation (unrelated)
> - Cleanup: delete table.
> {noformat}
>   cluster.killRegionServer(rs1.getRegionServer().getServerName());
>   Threads.sleep(1000); // Let the kill soak in.
>   util.waitUntilAllRegionsAssigned(tableName);
>   LOG.info("All regions assigned");
>   verifyMethodResult(SimpleRegionObserver.class,
> new String[] { "getCtPreReplayWALs", "getCtPostReplayWALs", 
> "getCtPreWALRestore",
> "getCtPostWALRestore", "getCtPrePut", "getCtPostPut" },
> tableName, new Integer[] { 1, 1, 2, 2, 0, 0 });
> } finally {
>   util.deleteTable(tableName);
>   table.close();
> }
>   }
> {noformat}
> However, looking at the test logs, we found that we had overlapping Assigns 
> with Unassigns. As a result, regions ended up 'stuck in RIT' and the test 
> timed out. Assigns were from the ServerCrashRecovery and Unassigns were from 
> the deleteTable cleanup.
> This raises the question: why did HBTU.waitUntilAllRegionsAssigned(tableName) 
> not wait until recovery was complete?
> Answer: it looks like that function is only meant for sunny-day scenarios, not 
> for crashes. It iterates over meta and just [checks for *some value* in the 
> server 
> column|https://github.com/apache/hbase/blob/cdc2bb17ff38dcbd273cf501aea565006e995a06/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java#L3421]
>  which is obviously present and equal to the server that was just killed.
> This bug must be affecting other fault tolerance tests too and fixing it may 
> fix more than just one test, hopefully.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19335) Fix waitUntilAllRegionsAssigned

2017-11-27 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19335:
-
   Resolution: Fixed
Fix Version/s: 2.0.0-beta-1
   Status: Resolved  (was: Patch Available)

> Fix waitUntilAllRegionsAssigned
> ---
>
> Key: HBASE-19335
> URL: https://issues.apache.org/jira/browse/HBASE-19335
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>  Labels: test
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19335.master.001.patch, 
> HBASE-19335.master.002.patch
>
>
> Found when debugging flaky test TestRegionObserverInterface#testRecovery.
> In the end, the test does the following:
> - Kills the RS
> - Waits for all regions to be assigned
> - Some validation (unrelated)
> - Cleanup: delete table.
> {noformat}
>   cluster.killRegionServer(rs1.getRegionServer().getServerName());
>   Threads.sleep(1000); // Let the kill soak in.
>   util.waitUntilAllRegionsAssigned(tableName);
>   LOG.info("All regions assigned");
>   verifyMethodResult(SimpleRegionObserver.class,
> new String[] { "getCtPreReplayWALs", "getCtPostReplayWALs", 
> "getCtPreWALRestore",
> "getCtPostWALRestore", "getCtPrePut", "getCtPostPut" },
> tableName, new Integer[] { 1, 1, 2, 2, 0, 0 });
> } finally {
>   util.deleteTable(tableName);
>   table.close();
> }
>   }
> {noformat}
> However, looking at the test logs, we found that we had overlapping Assigns 
> with Unassigns. As a result, regions ended up 'stuck in RIT' and the test 
> timed out. Assigns were from the ServerCrashRecovery and Unassigns were from 
> the deleteTable cleanup.
> This raises the question: why did HBTU.waitUntilAllRegionsAssigned(tableName) 
> not wait until recovery was complete?
> Answer: it looks like that function is only meant for sunny-day scenarios, not 
> for crashes. It iterates over meta and just [checks for *some value* in the 
> server 
> column|https://github.com/apache/hbase/blob/cdc2bb17ff38dcbd273cf501aea565006e995a06/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java#L3421]
>  which is obviously present and equal to the server that was just killed.
> This bug must be affecting other fault tolerance tests too and fixing it may 
> fix more than just one test, hopefully.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19112) Suspect methods on Cell to be deprecated

2017-11-27 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267859#comment-16267859
 ] 

Chia-Ping Tsai commented on HBASE-19112:


bq. Are you talking of deprecating Put#add(Cell) in hbase-client, 
Yep. If users want to add a custom cell, {{RawCell}} is what they should 
implement, as we have deprecated {{getTypeByte}} in Cell. 

> Suspect methods on Cell to be deprecated
> 
>
> Key: HBASE-19112
> URL: https://issues.apache.org/jira/browse/HBASE-19112
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19112_branch-2.patch
>
>
> [~chia7712] suggested on the [mailing 
> list|https://lists.apache.org/thread.html/e6de9af26d9b888a358ba48bf74655ccd893573087c032c0fcf01585@%3Cdev.hbase.apache.org%3E]
>  that we have some methods on Cell which should be deprecated for removal:
> * {{#getType()}}
> * {{#getTimestamp()}}
> * {{#getTag()}}
> * {{#getSequenceId()}}
> Let's make a pass over these (and maybe the rest) to make sure that there 
> aren't others which are either implementation details or methods returning 
> now-private-marked classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19319) Fix bug in synchronizing over ProcedureEvent

2017-11-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267850#comment-16267850
 ] 

Hudson commented on HBASE-19319:


FAILURE: Integrated in Jenkins build HBase-2.0 #926 (See 
[https://builds.apache.org/job/HBase-2.0/926/])
HBASE-19319 Fix bug in synchronizing over ProcedureEvent (appy: rev 
96e63ac7b8c91bb505b7a0e276df7a4cd1246542)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureScheduler.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureEnv.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureEvents.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureSchedulerConcurrency.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/locking/LockProcedure.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureEvent.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureScheduler.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureEvents.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/AbstractProcedureScheduler.java


> Fix bug in synchronizing over ProcedureEvent
> 
>
> Key: HBASE-19319
> URL: https://issues.apache.org/jira/browse/HBASE-19319
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19319.master.001.patch, 
> HBASE-19319.master.002.patch
>
>
> Following synchronizes over local variable rather than the original 
> ProcedureEvent object. Clearly a bug since this code block won't follow 
> exclusion with many of the synchronized methods in ProcedureEvent class.
> {code}
>  @Override
>   public void wakeEvents(final int count, final ProcedureEvent... events) {
> final boolean traceEnabled = LOG.isTraceEnabled();
> schedLock();
> try {
>   int waitingCount = 0;
>   for (int i = 0; i < count; ++i) {
> final ProcedureEvent event = events[i];
> synchronized (event) {
>   if (!event.isReady()) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

