[jira] [Updated] (HBASE-21164) reportForDuty to spew less log if master is initializing

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.008.patch

> reportForDuty to spew less log if master is initializing
> 
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.008.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. We should spew fewer of these logs. Here is an 
> example:
> {code:java}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...
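
For illustration, here is a minimal sketch of a retry loop that WARNs only on the 
first failure and then every Nth retry instead of every three seconds. This is not 
the actual HRegionServer code; the loop shape, the knobs, and the MasterClient 
interface are made up for the sketch.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ReportForDutyRetry {
  private static final Logger LOG = LoggerFactory.getLogger(ReportForDutyRetry.class);
  private static final long SLEEP_MS = 3000;   // assumed 3s retry period
  private static final int LOG_EVERY_N = 100;  // assumed: one WARN per ~5min

  interface MasterClient { boolean reportForDuty() throws Exception; }

  static void retryUntilRegistered(MasterClient master) throws InterruptedException {
    for (int attempt = 1;; attempt++) {
      try {
        if (master.reportForDuty()) {
          LOG.info("reportForDuty succeeded after {} attempt(s)", attempt);
          return;
        }
      } catch (Exception e) {
        // fall through to the rate-limited WARN below
      }
      if (attempt == 1 || attempt % LOG_EVERY_N == 0) {
        LOG.warn("reportForDuty failed; sleeping {}ms and then retrying (attempt {})",
            SLEEP_MS, attempt);
      }
      Thread.sleep(SLEEP_MS);
    }
  }
}
{code}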





[jira] [Commented] (HBASE-21164) reportForDuty to spew less log if master is initializing

2018-09-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614388#comment-16614388
 ] 

Hadoop QA commented on HBASE-21164:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} hbase-common: The patch generated 0 new + 2 
unchanged - 1 fixed = 2 total (was 3) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
8s{color} | {color:red} hbase-server: The patch generated 3 new + 230 unchanged 
- 0 fixed = 233 total (was 230) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  5m 
53s{color} | {color:red} The patch causes 10 errors with Hadoop v3.0.0. {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
43s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}217m 
24s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}263m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21164 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939636/HBASE-21164.006.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux cc00e5c51b84 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-21164) reportForDuty to spew less log if master is initializing

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Description: 
RegionServers do reportForDuty on startup to tell the Master they are available. 
If the Master is initializing, which on a big cluster can take a while 
(particularly if something is amiss), the log line every three seconds is 
annoying and not of any use. We should spew fewer of these logs. Here is an 
example:
{code:java}
2018-09-06 14:01:39,312 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
startcode=1536266763109
2018-09-06 14:01:39,312 WARN 
org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
sleeping and then retrying.

{code}
For example, I am looking at a large cluster now that had a backlog of 
procedure WALs. It is taking a couple of hours recreating the procedure-state 
because there are millions of procedures outstanding. Meantime, the Master log 
is just full of the above message -- every three seconds...

  was:
RegionServers do reportForDuty on startup to tell the Master they are available. 
If the Master is initializing, which on a big cluster can take a while 
(particularly if something is amiss), the log line every three seconds is 
annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
period. Here is an example:

{code}
2018-09-06 14:01:39,312 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
startcode=1536266763109
2018-09-06 14:01:39,312 WARN 
org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
sleeping and then retrying.

{code}

For example, I am looking at a large cluster now that had a backlog of 
procedure WALs. It is taking a couple of hours recreating the procedure-state 
because there are millions of procedures outstanding. Meantime, the Master log 
is just full of the above message -- every three seconds...


> reportForDuty to spew less log if master is initializing
> 
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch, 
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, 
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. We should spew fewer of these logs. Here is an 
> example:
> {code:java}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Updated] (HBASE-21164) reportForDuty to spew less log if master is initializing

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Summary: reportForDuty to spew less log if master is initializing  (was: 
reportForDuty should do (exponential) backoff rather than retry every 3 seconds 
(default).)

> reportForDuty to spew less log if master is initializing
> 
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch, 
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, 
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
> period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614383#comment-16614383
 ] 

Mingliang Liu commented on HBASE-21164:
---

V7 patch to address Allan's concern. Refactoring Sleeper seems unnecessary. We 
can remove the unit test too if it's not needed, as the change is not as major 
as the previous version.

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch, 
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, 
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
> period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-09-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614382#comment-16614382
 ] 

stack commented on HBASE-21035:
---

I've been trying to make basic progress on HBCK2. I pushed up an HBCK2 tool 
that can call our only Hbck method over in the hbase-operator-tools project: 
https://github.com/apache/hbase-operator-tools/commit/0cf0e0ecf2d4a33522e0e273f9310f11aa2eaee6.
 It is missing so much -- tests, how to package it, how to pass in a pointer to 
the cluster to fix, doc., etc., but I'm working on it.

Next is adding assign and bulk assign to the Hbck Service. This Hbck assign will 
be different from Admin assign in that it should work even while the Master is 
'initializing' (Admin assign fails because we check master state before we do 
anything -- which makes it so we can't schedule a meta assign if meta is 
offlined). The hbck assign bypasses stuff like calling CPs too. I also want bulk 
assign -- i.e. passing a thousand regions at a time to assign -- because when 
doing repairs, clusters will probably be big with lots of regions in odd states. 
I've been running a fixup job on a cluster where I have thousands of regions in 
OPENING state (I removed the Master WAL Procs after crashing it... ). Doing 
assigns one at a time on the command-line doesn't cut it... It takes from 10-40 
seconds per assign.
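
As a rough sketch of the shape such a bulk assign might take (the Hbck interface 
and the assigns(...) signature below are assumptions for illustration, not a 
committed API):

{code:java}
import java.io.IOException;
import java.util.List;

// Hypothetical bulk-assign entry point on the Hbck service.
interface Hbck {
  // Assumed: schedules AssignProcedures even while the Master is 'initializing',
  // skipping the master-state check (and CP hooks) that Admin assign runs.
  List<Long> assigns(List<String> encodedRegionNames) throws IOException;
}

class Hbck2BulkAssign {
  static void fixup(Hbck hbck, List<String> stuckRegions) throws IOException {
    // One call for a thousand regions, instead of one 10-40 second command per region.
    List<Long> procIds = hbck.assigns(stuckRegions);
    System.out.println("Scheduled " + procIds.size() + " assign procedures");
  }
}
{code}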

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch, 
> HBASE-21035.branch-2.1.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It will 
> only check WAL dirs and compare them to the Zookeeper RS nodes to decide which 
> servers need to expire. For servers whose dir ends with 'SPLITTING', we ensure 
> that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, and 
> all the procedure wals are lost (due to a bug, or deleted manually, whatever), 
> the newly restarted master will be stuck when initing, since no one will bring 
> the meta region online.
> Although it is an anomalous case, I think that no matter what happens, we need 
> to bring the meta region online. Otherwise we are sitting ducks; nothing can 
> be done.





[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.007.patch

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch, 
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, 
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
> period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Commented] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-13 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614380#comment-16614380
 ] 

ramkrishna.s.vasudevan commented on HBASE-21102:


Oh I see. Someone has raised an issue for the failures. I am just seeing it now.

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: HBASE-21102_1.patch, HBASE-21102_2.patch, 
> HBASE-21102_3.patch, HBASE-21102_4.patch, HBASE-21102_initial.patch
>
>
> Currently, when a server with a region replica crashes, there is no guarantee 
> that the target server chosen for the replica region assignment hosts no other 
> replica of the region being assigned. What happens today is that we assign 
> randomly, and the LB later comes along, identifies these cases, and does a 
> MOVE for such regions. It would be better if we could pick target servers in a 
> way that at least minimally ensures replicas are not colocated.
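
A hedged sketch of the selection the description asks for; the types below are 
simplified stand-ins for the balancer machinery, not the actual API:

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;

class ReplicaAwarePicker {
  /**
   * Pick a target for a replica assignment, preferring servers that host no
   * other replica of the same region; fall back to random if none qualifies.
   */
  static String pickTarget(String region, List<String> servers,
      Map<String, Set<String>> regionsByServer) {
    List<String> candidates = servers.stream()
        .filter(s -> !regionsByServer.getOrDefault(s, Collections.emptySet())
            .contains(region))
        .collect(Collectors.toList());
    List<String> pool = candidates.isEmpty() ? servers : candidates;
    return pool.get(ThreadLocalRandom.current().nextInt(pool.size()));
  }
}
{code}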





[jira] [Commented] (HBASE-21178) Get and Scan operation with converter_class not working

2018-09-13 Thread Subrat Mishra (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614377#comment-16614377
 ] 

Subrat Mishra commented on HBASE-21178:
---

{quote}Which version break it? U know which jira ? 
{quote}
Jira id: HBASE-18067 and version 2.0.0

> Get and Scan operation with converter_class not working
> ---
>
> Key: HBASE-21178
> URL: https://issues.apache.org/jira/browse/HBASE-21178
> Project: HBase
>  Issue Type: Bug
>Reporter: Subrat Mishra
>Assignee: Subrat Mishra
>Priority: Major
> Attachments: HBASE-21178.master.001.patch
>
>
> Consider a simple scenario:
> {code:java}
> create 'foo', {NAME => 'f1'}
> put 'foo','r1','f1:a',1000
> get 'foo','r1',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']} 
> scan 'foo',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']}{code}
> Both get and scan fail with an ERROR:
> {code:java}
> ERROR: wrong number of arguments (3 for 1) {code}
> It looks like converter_method in table.rb has expected 3 arguments [(bytes, 
> offset, len)] since version 2.0.0; prior to 2.0.0 it took only 1 argument 
> [(bytes)].
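
For illustration, a custom converter compatible with the 3-argument form; the 
class name is made up, and only the (byte[], int, int) shape comes from the 
report above:

{code:java}
public final class LenConverter {
  // Invoked by the shell as e.g. 'f1:a:c(com.example.LenConverter).len'.
  // Since 2.0.0 the shell passes the backing array plus offset and length,
  // rather than just the byte[].
  public static String len(byte[] bytes, int offset, int len) {
    return String.valueOf(len);
  }
}
{code}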





[jira] [Commented] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614374#comment-16614374
 ] 

Hadoop QA commented on HBASE-20993:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
10s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 
 3s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
 4s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
59s{color} | {color:green} root: The patch generated 0 new + 81 unchanged - 1 
fixed = 81 total (was 82) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  0s{color} | 
{color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
59s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 46s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
10s{color} | {color:green} hbase-checkstyle in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
32s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}107m 
49s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | 

[jira] [Commented] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-13 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614373#comment-16614373
 ] 

Duo Zhang commented on HBASE-21102:
---

Is HBASE-21197 for this problem? I think we can resolve this issue first and 
start working on fixing the UT there. After that we can open backport issues.

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: HBASE-21102_1.patch, HBASE-21102_2.patch, 
> HBASE-21102_3.patch, HBASE-21102_4.patch, HBASE-21102_initial.patch
>
>
> Currently, when a server with a region replica crashes, there is no guarantee 
> that the target server chosen for the replica region assignment hosts no other 
> replica of the region being assigned. What happens today is that we assign 
> randomly, and the LB later comes along, identifies these cases, and does a 
> MOVE for such regions. It would be better if we could pick target servers in a 
> way that at least minimally ensures replicas are not colocated.





[jira] [Commented] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-13 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614372#comment-16614372
 ] 

ramkrishna.s.vasudevan commented on HBASE-21102:


[~Apache9]
I already downloaded the artifacts from the build and verified the logs. The 
existing logs do not give much info. When I wanted to fix the flakiness before 
I committed the patch, I added intermediate logs and only then found that 
randomAssignment() was having some issues. Since the test passes consistently 
on my local cluster, I would like to push in some logs and then rerun the 
tests, probably in precommits. If pre-commit does not fail, then let me push 
those logs and generate a build. But one question I have: will the flaky test 
rerun after the commit with the log msgs added is done?

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: HBASE-21102_1.patch, HBASE-21102_2.patch, 
> HBASE-21102_3.patch, HBASE-21102_4.patch, HBASE-21102_initial.patch
>
>
> Currently, when a server with a region replica crashes, there is no guarantee 
> that the target server chosen for the replica region assignment hosts no other 
> replica of the region being assigned. What happens today is that we assign 
> randomly, and the LB later comes along, identifies these cases, and does a 
> MOVE for such regions. It would be better if we could pick target servers in a 
> way that at least minimally ensures replicas are not colocated.





[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614360#comment-16614360
 ] 

Mingliang Liu commented on HBASE-21164:
---

I assumed the slowdown of master startup was acceptable for this case, so I 
favored the backoff. Unlike cluster shutdown(), the RS sleeper cannot be woken 
if the master is ready early. The up-to-1min wait is unfortunately unavoidable 
in the worst case.
{quote}We'd just turn off the log spew?
{quote}
My initial idea was to dump a log line every few dozen retries (say, every 100, 
so 3s*100 = 5min). Shall we go with that?
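
For comparison, a minimal sketch of the capped backoff discussed above, assuming 
a 3s base doubling up to a 1min ceiling (so the worst-case extra wait after the 
master becomes ready is bounded by that ceiling):

{code:java}
class CappedBackoff {
  static final long BASE_MS = 3_000;  // assumed initial retry period
  static final long MAX_MS = 60_000;  // assumed ceiling: up to 1min in the worst case

  /** Sleep before retry number 'attempt' (0-based): 3s, 6s, 12s, ..., capped at 60s. */
  static long sleepForAttempt(int attempt) {
    long backoff = BASE_MS << Math.min(attempt, 20);  // clamp the shift to avoid overflow
    return Math.min(backoff, MAX_MS);
  }
}
{code}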

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
> period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614317#comment-16614317
 ] 

stack commented on HBASE-20952:
---

Where is the 'design doc' that we're talking about? Is it the google doc 
attached in the middle of this JIRA? The overview? If so, I was wondering if 
this doc was going to get a revision? Seems like there are plenty of questions 
and back-and-forth above that might warrant consideration and that might have 
an impact on the API and on general subsystem thinking. (Let's add a link to 
this doc up at the top of this issue.)

I like the [~Apache9] questions. Is it that he's done more homework, such that 
he has these questions, or that he just has a better understanding of how the 
system works? IMO, it tends to be easier working through concerns in a design 
doc than in comments in JIRA/RB or in code review; the latter tend to get 
distributed all over, and moving code without the high-level picture figured 
out can bring on myopia. Is the 'design doc' we talk of above the place to work 
through his concerns, or is that somewhere else?

Thanks

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-15631) Backport Regionserver Groups (HBASE-6721) to branch-1

2018-09-13 Thread loushang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614314#comment-16614314
 ] 

loushang commented on HBASE-15631:
--

Hi, any configuration suggestion or example for this patch to be used at branch 
1-4?
 It seems that the configuration in HBASE-6721 does not work here.
 greate appreciation!

> Backport Regionserver Groups (HBASE-6721) to branch-1 
> --
>
> Key: HBASE-15631
> URL: https://issues.apache.org/jira/browse/HBASE-15631
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 1.4.0
>Reporter: Francis Liu
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.4.0
>
> Attachments: HBASE-15631-branch-1-addendum.patch, 
> HBASE-15631-branch-1.patch, HBASE-15631-branch-1.patch, 
> HBASE-15631-branch-1.patch, HBASE-15631.branch-1.patch, HBASE-15631.patch
>
>
> Based on the dev list discussion, backporting region server groups should not 
> be an issue, as it does not: 1. destabilize the code; 2. cause backward 
> incompatibility.





[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614306#comment-16614306
 ] 

stack commented on HBASE-21164:
---

This is a fair point [~allan163]. So, the RS should just continue to bang on 
the Master every three seconds? We'd just turn off the log spew?

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available. 
> If the Master is initializing, which on a big cluster can take a while 
> (particularly if something is amiss), the log line every three seconds is 
> annoying and not of any use. Do backoff on failure, up to a reasonable maximum 
> period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-13 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614292#comment-16614292
 ] 

Duo Zhang commented on HBASE-20952:
---

The API is not the first thing to decide. As I said above, the first thing is 
that we need to know the overall solution. You can see our design docs for 
serial replication and sync replication:

https://docs.google.com/document/d/1LHC3IRUc5i2V4_roNw8BDAOKGM4bEapR_hefpZxDT00/edit

https://docs.google.com/document/d/193D3aOxD-muPIZuQfI4Zo3_qg6-Nepeu_kraYJVQkiE/edit#heading=h.e8l9k556m3wi

There is no API design in them, but we try our best to describe how we plan to 
do it in HBase.

{quote}
This is good; I hadn't thought about abstracting out fencing. We should have 
API which pushes this fencing impl down into the Provider. For the Ratis 
LogService, we designed api to be able to close() a Log; make it read-only. In 
the context of HBase, we would close the Log before we start 
recovery/re-assignment, and have the net-effect of preventing any half-dead RS 
from continuing to try to add more edits to the Log. This effectively would 
work like recoverLease() does now for the HDFS case.
{quote}

Yes this is what I really want to discuss, not something like whether we should 
use WALInfo or WALIdentity.

The information you described is still not enough to solve all the problems. 
Today we roll the wal writer, and it is done by the RS, so closing the wal file 
is not enough, as the RS will try to open a new one and write to it. That's why 
we need to rename the wal directory. From your words above, it seems to me that 
we would have only one stream opened forever for an RS; then how do we drop the 
old edits after flush? And how do we set up the wal stream? Only once at RS 
startup? And if there are errors later, do we just abort, without trying to 
recover or open a new stream? Or will that be handled by ratis? Also, for the 
FileSystem we use multi wal to increase performance, and that logic is mixed up 
with WALProvider. Does ratis still need multi wal for performance? And if not, 
what's the plan? Do we need to refactor the multi-wal-related code to work not 
against the WALProvider but against the FileSystem-related stuff directly?

For the sync replication thing, it is just a DualAsyncWriter that writes to 
two HDFS clusters at once; I think it is possible to write to other log 
systems, such as ratis, if you still share the AsyncWriter interface. The 
problem here is how to describe the place where we write the remote wals. 
For FileSystem-based wals it is just a directory on a remote cluster, for 
example "hdfs://cluster-name/path". We need to find a way to describe other 
log systems.
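
To make the fencing point concrete, here is a rough, hypothetical shape for a 
provider-level log handle; all names are invented as a discussion aid, not a 
proposed patch:

{code:java}
import java.io.IOException;

// closeForWriting() plays the role recoverLease() plays on HDFS today:
// once fenced, a half-dead RS can no longer append edits.
interface WALog {
  long append(byte[] edit) throws IOException;               // fails once fenced
  void sync() throws IOException;                            // durability barrier
  void closeForWriting() throws IOException;                 // fence: read-only from now on
  Iterable<byte[]> tail(long fromSeqId) throws IOException;  // replication use-case
}
{code}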

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-09-13 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614288#comment-16614288
 ] 

Reid Chan commented on HBASE-20734:
---

I think it is ok to go.
Please provide a patch for branch-1 and ping Andrew to take a look.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch, 
> HBASE-20734.branch-1.002.patch, HBASE-20734.branch-1.003.patch, 
> HBASE-20734.branch-1.004.patch, HBASE-20734.master.001.patch, 
> HBASE-20734.master.002.patch, HBASE-20734.master.003.patch, 
> HBASE-20734.master.004.patch, HBASE-20734.master.005.patch, 
> HBASE-20734.master.006.patch, HBASE-20734.master.007.patch, 
> HBASE-20734.master.008.patch, HBASE-20734.master.009.patch, 
> HBASE-20734.master.010.patch, HBASE-20734.master.011.patch, 
> HBASE-20734.master.012.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than the hbase rootdir w.r.t. recovered edits, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Commented] (HBASE-21188) Print heap and gc informations in our junit ResourceChecker

2018-09-13 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614280#comment-16614280
 ] 

Duo Zhang commented on HBASE-21188:
---

Just want to confirm whether GC is a problem for our slow tests. At least GC 
count and GC time should not be considered a 'Resource'. Will revert the patch 
later, as it seems that GC is not the actual problem.

> Print heap and gc informations in our junit ResourceChecker
> ---
>
> Key: HBASE-21188
> URL: https://issues.apache.org/jira/browse/HBASE-21188
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21188.patch
>
>






[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614279#comment-16614279
 ] 

Ted Yu commented on HBASE-21160:


As I said above, when there is no assertion at the end of the try block, you 
don't need to make a change.

Please also keep the try-with-resources structure, which releases the resource.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}
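
One hedged way to address the warning while keeping try-with-resources is to 
rethrow AssertionError before the catch-all; this sketch abbreviates the 
snippet above and elides the scan itself:

{code:java}
PrivilegedExceptionAction<Void> scanAction = new PrivilegedExceptionAction<Void>() {
  @Override
  public Void run() throws Exception {
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
      // ... scan and read `next` as in the original test ...
      assertEquals(1, next.length);
    } catch (AssertionError e) {
      throw e;  // let the test framework see the failure instead of wrapping it
    } catch (Throwable t) {
      throw new IOException(t);
    }
    return null;
  }
};
{code}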





[jira] [Updated] (HBASE-21112) [Auth] IPC client fallback to simple auth (forward-port to master)

2018-09-13 Thread Reid Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-21112:
--
Fix Version/s: 2.2.0
   3.0.0

> [Auth] IPC client fallback to simple auth (forward-port to master)
> --
>
> Key: HBASE-21112
> URL: https://issues.apache.org/jira/browse/HBASE-21112
> Project: HBase
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: master
> Fix For: 3.0.0, 2.2.0
>
>






[jira] [Reopened] (HBASE-21112) [Auth] IPC client fallback to simple auth (forward-port to master)

2018-09-13 Thread Reid Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan reopened HBASE-21112:
---

> [Auth] IPC client fallback to simple auth (forward-port to master)
> --
>
> Key: HBASE-21112
> URL: https://issues.apache.org/jira/browse/HBASE-21112
> Project: HBase
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: master
>






[jira] [Comment Edited] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread liubangchen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614271#comment-16614271
 ] 

liubangchen edited comment on HBASE-21160 at 9/14/18 2:44 AM:
--

Hi [~yuzhih...@gmail.com] code review link is here 
[reviews-68699|https://reviews.apache.org/r/68699/]


was (Author: liubangchen):
Hi [~yuzhih...@gmail.com] code review link is here 
[#https://reviews.apache.org/r/68699/]

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}





[jira] [Updated] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-13 Thread Reid Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-20993:
--
Fix Version/s: (was: 2.2.0)
   (was: 3.0.0)

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --
>
> Key: HBASE-20993
> URL: https://issues.apache.org/jira/browse/HBASE-20993
> Project: HBase
>  Issue Type: Bug
>  Components: Client, security
>Affects Versions: 1.2.6
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Critical
> Fix For: 1.5.0, 1.4.8
>
> Attachments: HBASE-20993.001.patch, 
> HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, 
> HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, 
> HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, 
> HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, 
> HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, 
> HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, 
> HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml: hadoop.security.authentication:kerberos, 
> hbase.security.authentication:kerberos, 
> hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal 
> are set correctly.
> With a simple-auth hbase cluster and a kerberized hbase client application, 
> any application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
>   at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>   at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:753)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:674)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:607)
>   at 
> org.playground.hbase.KerberizedClientFallback.main(KerberizedClientFallback.java:55)
> 

[jira] [Commented] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-13 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614272#comment-16614272
 ] 

Reid Chan commented on HBASE-20993:
---

Thanks for the info Jack; just enjoy your time, no rush.

Triggering branch-1 v9 again; if QA gives +1, we should let it go in first.
I'll reopen the jira to port to the master branch.

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --
>
> Key: HBASE-20993
> URL: https://issues.apache.org/jira/browse/HBASE-20993
> Project: HBase
>  Issue Type: Bug
>  Components: Client, security
>Affects Versions: 1.2.6
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: HBASE-20993.001.patch, 
> HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, 
> HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, 
> HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, 
> HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, 
> HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, 
> HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, 
> HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml has hadoop.security.authentication:kerberos, 
> hbase.security.authentication:kerberos, and 
> hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal 
> are set correctly.
> Given a simple-auth HBase cluster and a kerberized HBase client application, 
> an application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>   at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738)
>   at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:753)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:674)
>   at 
> 

[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread liubangchen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614271#comment-16614271
 ] 

liubangchen commented on HBASE-21160:
-

Hi [~yuzhih...@gmail.com], the code review link is here: 
[https://reviews.apache.org/r/68699/]

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new PrivilegedExceptionAction() {
>     @Override
>     public Void run() throws Exception {
>       try (Connection connection = ConnectionFactory.createConnection(conf);
>           ...
>         assertEquals(1, next.length);
>       } catch (Throwable t) {
>         throw new IOException(t);
>       }
> {code}
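
As an aside, a self-contained sketch of the warned-about pattern and one way 
out of it (a hypothetical test skeleton, not the actual patch): catching 
{{Throwable}} around an assertion converts the {{AssertionError}} thrown by a 
failed assertion into an {{IOException}}, which the enclosing try block then 
swallows.

{code:java}
import static org.junit.Assert.assertEquals;

import java.io.IOException;

public class AssertionFailureIgnoredExample {
  // Anti-pattern: catch (Throwable) also catches the AssertionError thrown by
  // assertEquals, so a genuine test failure is rewrapped and can be masked.
  static void bad(int[] next) throws IOException {
    try {
      assertEquals(1, next.length);
    } catch (Throwable t) {
      throw new IOException(t);
    }
  }

  // One fix: catch only Exception; AssertionError is an Error, so it
  // propagates and fails the test as intended.
  static void good(int[] next) throws IOException {
    try {
      assertEquals(1, next.length);
    } catch (Exception e) {
      throw new IOException(e);
    }
  }
}
{code}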



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-13 Thread Reid Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-20993:
--
Attachment: HBASE-20993.branch-1.009.patch

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --
>
> Key: HBASE-20993
> URL: https://issues.apache.org/jira/browse/HBASE-20993
> Project: HBase
>  Issue Type: Bug
>  Components: Client, security
>Affects Versions: 1.2.6
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: HBASE-20993.001.patch, 
> HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, 
> HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, 
> HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, 
> HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, 
> HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, 
> HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, 
> HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml has hadoop.security.authentication:kerberos, 
> hbase.security.authentication:kerberos, and 
> hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal 
> are set correctly.
> Given a simple-auth HBase cluster and a kerberized HBase client application, 
> an application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>   at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738)
>   at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:753)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:674)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:607)
>   at org.playground.hbase.KerberizedClientFallback.main(KerberizedClientFallback.java:55)
> Caused by: 

[jira] [Resolved] (HBASE-9469) Synchronous replication

2018-09-13 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-9469.
--
Resolution: Duplicate

> Synchronous replication
> ---
>
> Key: HBASE-9469
> URL: https://issues.apache.org/jira/browse/HBASE-9469
> Project: HBase
>  Issue Type: New Feature
>Reporter: Honghua Feng
>Priority: Major
>
> Scenario: 
> A/B clusters with master-master replication: clients write to cluster A, and 
> A pushes all writes to cluster B; when cluster A is down, clients switch to 
> writing to cluster B.
> But the client's write switch is unsafe because replication between A and B 
> is asynchronous: a delete sent to cluster B which aims to delete a put 
> written earlier can fail, because that put was written to cluster A and 
> wasn't successfully pushed to B before A went down. It can be worse: if this 
> delete is collected (a flush and then a major compaction occur) before 
> cluster A is back up and that put is eventually pushed to B, the put won't 
> ever be deleted.
> Can we provide per-table/per-peer synchronous replication which ships the 
> corresponding hlog entry of a write before responding success to the client? 
> With this we can guarantee the client that all write requests for which it 
> got a success response when writing to cluster A are already in cluster B as 
> well.
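
A minimal sketch of the write path the description asks for; every name here 
is hypothetical, not HBase API:

{code:java}
// Hypothetical per-peer synchronous replication: the server only acks the
// client after the WAL entry is acknowledged by the remote peer.
interface WalEndpoint {
  void append(byte[] walEntry) throws java.io.IOException; // blocks until persisted
}

final class SyncReplicatingWriter {
  private final WalEndpoint localWal; // cluster A's WAL
  private final WalEndpoint peerWal;  // cluster B's WAL

  SyncReplicatingWriter(WalEndpoint localWal, WalEndpoint peerWal) {
    this.localWal = localWal;
    this.peerWal = peerWal;
  }

  void write(byte[] walEntry) throws java.io.IOException {
    localWal.append(walEntry); // durable on cluster A
    peerWal.append(walEntry);  // durable on cluster B before we respond
    // Only now does the client see success, so a later switch to B is safe:
    // every acknowledged write is guaranteed to be present on B.
  }
}
{code}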



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-9469) Synchronous replication

2018-09-13 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-9469:
--

> Synchronous replication
> ---
>
> Key: HBASE-9469
> URL: https://issues.apache.org/jira/browse/HBASE-9469
> Project: HBase
>  Issue Type: New Feature
>Reporter: Honghua Feng
>Priority: Major
>
> Scenario: 
> A/B clusters with master-master replication: clients write to cluster A, and 
> A pushes all writes to cluster B; when cluster A is down, clients switch to 
> writing to cluster B.
> But the client's write switch is unsafe because replication between A and B 
> is asynchronous: a delete sent to cluster B which aims to delete a put 
> written earlier can fail, because that put was written to cluster A and 
> wasn't successfully pushed to B before A went down. It can be worse: if this 
> delete is collected (a flush and then a major compaction occur) before 
> cluster A is back up and that put is eventually pushed to B, the put won't 
> ever be deleted.
> Can we provide per-table/per-peer synchronous replication which ships the 
> corresponding hlog entry of a write before responding success to the client? 
> With this we can guarantee the client that all write requests for which it 
> got a success response when writing to cluster A are already in cluster B as 
> well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-13 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614257#comment-16614257
 ] 

Josh Elser commented on HBASE-20952:


Good questions! Thanks for taking the time to write them, Duo.
{quote}How do we do fencing when RS crashes? Now we need to rename the wal 
directory for a RS, and then call recoverLease for all the files to confirm 
that they are all closed. And at RS side, when creating a wal writer, we use 
createNonRecursive intentionally, so that if the wal directory has been 
renamed, we cannot create wal writers any more. How do we want to abstract 
these operations in the new WAL API? How do other log systems, such as Ratis, 
deal with this?
{quote}
This is good; I hadn't thought about abstracting out fencing. We should have 
an API which pushes this fencing impl down into the Provider. For the Ratis 
LogService, we designed the API to be able to {{close()}} a Log, making it 
read-only. In the context of HBase, we would close the Log before we start 
recovery/re-assignment, with the net effect of preventing any half-dead RS 
from continuing to try to add more edits to the Log. This would effectively 
work like recoverLease() does now for the HDFS case; see the sketch below.
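
To illustrate, a rough sketch of what close()-based fencing could look like in 
a storage-agnostic WAL API; the names are illustrative, not the API on review:

{code:java}
// Illustrative only: a provider-agnostic fencing hook.
interface WriteAheadLog {
  void append(byte[] edit) throws java.io.IOException;

  // Transition the log to read-only. After this returns, any in-flight or
  // future append from a half-dead RS must fail, analogous to how renaming
  // the WAL dir plus recoverLease() fences writers on HDFS today.
  void close() throws java.io.IOException;
}

final class RecoverySketch {
  // Master-side recovery: fence first, then replay.
  static void recover(WriteAheadLog wal) throws java.io.IOException {
    wal.close(); // fence: no more edits can sneak in
    // ...read the now-immutable log and re-assign regions...
  }
}
{code}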
{quote}For sync replication, we have a config called remote wal directory, 
which exposes the file system to the user. As it is implemented by us at 
Xiaomi, we can help to find a workaround for this.
{quote}
Ok. I'm definitely dense here :). Do you have a pointer to some code to look 
at? Or, based on my previous comment, is a solution obvious to you?
{quote}looking at the code on the RB, we have already started to change the 
stuff in replication? And for RecoveredReplicationSource, we make it abstract 
and introduce a new FSRecoveredReplicationSource? Then where is the 
FSReplicationSource?
{quote}
There is a second RB open which has a much-reduced version of that original 
patch. Looks like this might not have gotten attached to this Jira issue (oops, 
will make sure that's linked).

[https://reviews.apache.org/r/68672]

This should give a much smaller view of the API only. Trying to make some of 
the other "systems" using WALs work with a new API was a good exercise to make 
sure we didn't miss something obvious. Totally in agreement that we want a 
good API before we start throwing out an implementation.

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like, RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and backup. 
> Replication has the use-case for "tailing" the WAL, which we should provide 
> via our new API. Backup doesn't do anything fancy (IIRC). We should make sure 
> all consumers are generally going to be OK with the API we create.
> The API may be "OK" (or OK in part). We need to also consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like {{WALSplitter}}) 
> should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614238#comment-16614238
 ] 

Hadoop QA commented on HBASE-21196:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
59s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 35s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
19s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}243m 14s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}292m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.TestReplicationKillSlaveRSWithSeparateOldWALs |
|   | hadoop.hbase.replication.TestReplicationKillSlaveRS |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.replication.TestReplicationSmallTests |
|   | hadoop.hbase.replication.TestReplicationSmallTestsSync |
|   | hadoop.hbase.client.TestRegionLocationCaching |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21196 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939613/HBASE-21196.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  

[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-13 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614232#comment-16614232
 ] 

Duo Zhang commented on HBASE-20952:
---

The design doc does not help; it is just like pseudo-code. What I want to know 
is how we deal with several key problems if we want to remove the direct 
dependency on FileSystem. Here is a simple list that comes immediately to my 
mind:

1. How do we do fencing when RS crashes? Now we need to rename the wal 
directory for a RS, and then call recoverLease for all the files to confirm 
that they are all closed. And at RS side, when creating a wal writer, we use 
createNonRecursive intentionally, so that if the wal directory has been 
renamed, we cannot create wal writers any more (see the sketch at the end of 
this comment). How do we want to abstract these operations in the new WAL API? 
How do other log systems, such as Ratis, deal with this?

2. For sync replication, we have a config called remote wal directory, which 
exposes the file system to the user. As it is implemented by us at Xiaomi, we 
can help to find a workaround for this. And the sync replication also relies 
on the rename operation to do fencing.

3. The replication-related stuff. I have been asking about this for a long 
time, but no one has given an overall solution. And looking at the code on the 
RB, we have already started to change the stuff in replication? And for 
RecoveredReplicationSource, we make it abstract and introduce a new 
FSRecoveredReplicationSource? Then where is the FSReplicationSource?

I always say we should have an overall solution first, i.e., we should know 
what the system will look like when we finish. Then we start to work things 
out.

Thanks.
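
To make point 1 concrete, a simplified sketch of the rename-plus-recoverLease 
fencing described above (the flow is condensed; the real code retries 
recoverLease and handles errors):

{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

final class WalFencingSketch {
  // Master side: fence a crashed RS before splitting its WALs.
  static void fence(FileSystem fs, Path walDir, Path splittingDir) throws Exception {
    // 1. Rename the RS's WAL dir. Because the RS creates its writers with
    //    createNonRecursive(), it cannot recreate files under the old path.
    fs.rename(walDir, splittingDir);
    // 2. Recover the lease on every file to confirm they are all closed.
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    for (FileStatus f : fs.listStatus(splittingDir)) {
      dfs.recoverLease(f.getPath()); // real code loops until this returns true
    }
    // 3. The logs are now immutable and safe to split/replay.
  }
}
{code}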

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like, RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and backup. 
> Replication has the use-case for "tailing" the WAL, which we should provide 
> via our new API. Backup doesn't do anything fancy (IIRC). We should make sure 
> all consumers are generally going to be OK with the API we create.
> The API may be "OK" (or OK in part). We need to also consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like {{WALSplitter}}) 
> should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-09-13 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614231#comment-16614231
 ] 

Allan Yang commented on HBASE-21035:


{quote}
So let's start helping on HBCK2?
{quote}
Sure!

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch, 
> HBASE-21035.branch-2.1.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to expire. For servers whose dir ends with 'SPLITTING', we 
> trust that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, 
> and if all the procedure wals are lost (due to a bug, or deleted manually, 
> whatever), the newly restarted master will be stuck while initializing, since 
> no one will bring the meta region online.
> Although it is an anomalous case, I think no matter what happens, we need to 
> online the meta region. Otherwise, we are sitting ducks; nothing can be done.
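
For readers following along, a pseudo-Java sketch of the startup check the 
description refers to (names are illustrative, not the HBASE-20708 code):

{code:java}
import java.util.Set;
import java.util.function.Consumer;

final class StartupExpireSketch {
  // Expire servers that left a WAL dir behind but are no longer in ZooKeeper.
  static void expireDeadServers(Set<String> walDirServers, Set<String> zkOnlineServers,
      Consumer<String> scheduleServerCrashProcedure) {
    for (String server : walDirServers) {
      if (server.endsWith("SPLITTING")) {
        continue; // an SCP is trusted to exist already for these
      }
      if (!zkOnlineServers.contains(server)) {
        scheduleServerCrashProcedure.accept(server);
      }
    }
    // If the meta server's dir is marked SPLITTING but all procedure WALs were
    // lost, the trusted SCP does not exist and nothing brings meta online --
    // the stuck-init case this issue fixes.
  }
}
{code}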



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614222#comment-16614222
 ] 

Ted Yu commented on HBASE-20734:


I won't have a big chunk of time to review; I'm working on the WAL refactoring.

FYI

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch, 
> HBASE-20734.branch-1.002.patch, HBASE-20734.branch-1.003.patch, 
> HBASE-20734.branch-1.004.patch, HBASE-20734.master.001.patch, 
> HBASE-20734.master.002.patch, HBASE-20734.master.003.patch, 
> HBASE-20734.master.004.patch, HBASE-20734.master.005.patch, 
> HBASE-20734.master.006.patch, HBASE-20734.master.007.patch, 
> HBASE-20734.master.008.patch, HBASE-20734.master.009.patch, 
> HBASE-20734.master.010.patch, HBASE-20734.master.011.patch, 
> HBASE-20734.master.012.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than the hbase rootdir w.r.t. recovered edits, since the recovered edits 
> directory currently lives under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.
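
For context, the kind of split-media setup the description refers to, as a 
minimal hbase-site.xml sketch (the paths are examples):

{code:xml}
<configuration>
  <!-- Bulk data on the ordinary rootdir... -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode:8020/hbase</value>
  </property>
  <!-- ...while WALs (and, with this change, recovered edits) sit on fast media. -->
  <property>
    <name>hbase.wal.dir</name>
    <value>hdfs://namenode:8020/hbase-wal-ssd</value>
  </property>
</configuration>
{code}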



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614228#comment-16614228
 ] 

Allan Yang commented on HBASE-21164:


Another concern is that if the master is down for a long time, the 
regionserver will only retry reportForDuty as often as the backoff cap (one 
minute) allows, and that will slow down the master startup process, since we 
need to count enough RSes before continuing.
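
For illustration, a minimal sketch of the capped exponential backoff under 
discussion (the constants are illustrative, not the patch's values):

{code:java}
final class ReportForDutyBackoffSketch {
  static final long INITIAL_SLEEP_MS = 3_000; // the current fixed retry interval
  static final long MAX_SLEEP_MS = 60_000;    // the one-minute cap mentioned above

  // 3s, 6s, 12s, ... capped at 60s; attempt is 0-based.
  static long sleepForRetry(int attempt) {
    long sleep = INITIAL_SLEEP_MS << Math.min(attempt, 30); // bound the shift
    return Math.min(sleep, MAX_SLEEP_MS);
  }
}
{code}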

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell Master they are available. 
> If Master is initializing, especially on a big cluster where it can take a 
> while, particularly if something is amiss, the log every three seconds is 
> annoying and doesn't do anything of use. Back off on failure, up to a 
> reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21197) TestServerCrashProcedureWithReplicas fails intermittently

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21197:
--
Description: 
Example failure reports are: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/] and 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14381/testReport/] 

Failing test methods are:
- {{testRecoveryAndDoubleExecutionOnRsWithMeta}}
- {{testRecoveryAndDoubleExecutionOnRsWithoutMeta}}
- {{testCrashTargetRs}}.

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}

  was:
Example failure reports are: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/] and 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14381/testReport/] 

Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
{{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}


> TestServerCrashProcedureWithReplicas fails intermittently
> -
>
> Key: HBASE-21197
> URL: https://issues.apache.org/jira/browse/HBASE-21197
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Example failure reports are: 
> [https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/] and 
> [https://builds.apache.org/job/PreCommit-HBASE-Build/14381/testReport/] 
> Failing test methods are:
> - {{testRecoveryAndDoubleExecutionOnRsWithMeta}}
> - {{testRecoveryAndDoubleExecutionOnRsWithoutMeta}}
> - {{testCrashTargetRs}}.
> Specifically, the exception trace is:
> {code:java}
> java.lang.AssertionError: Crashed replica regions should not be assigned to 
> same region server
>   at 
> org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21197) TestServerCrashProcedureWithReplicas fails intermittently

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21197:
--
Description: 
Example failure reports are: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/] and 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14381/testReport/] 

Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
{{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}

  was:
An example failure report: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/]

Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
{{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}


> TestServerCrashProcedureWithReplicas fails intermittently
> -
>
> Key: HBASE-21197
> URL: https://issues.apache.org/jira/browse/HBASE-21197
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Example failure reports are: 
> [https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/] and 
> [https://builds.apache.org/job/PreCommit-HBASE-Build/14381/testReport/] 
> Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
> {{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.
> Specifically, the exception trace is:
> {code:java}
> java.lang.AssertionError: Crashed replica regions should not be assigned to 
> same region server
>   at 
> org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-09-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614219#comment-16614219
 ] 

Zach York commented on HBASE-20734:
---

[~yuzhih...@gmail.com] any further thoughts?

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch, 
> HBASE-20734.branch-1.002.patch, HBASE-20734.branch-1.003.patch, 
> HBASE-20734.branch-1.004.patch, HBASE-20734.master.001.patch, 
> HBASE-20734.master.002.patch, HBASE-20734.master.003.patch, 
> HBASE-20734.master.004.patch, HBASE-20734.master.005.patch, 
> HBASE-20734.master.006.patch, HBASE-20734.master.007.patch, 
> HBASE-20734.master.008.patch, HBASE-20734.master.009.patch, 
> HBASE-20734.master.010.patch, HBASE-20734.master.011.patch, 
> HBASE-20734.master.012.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614210#comment-16614210
 ] 

Mingliang Liu commented on HBASE-21164:
---

{quote}
> There is a facility to wake this.sleeper. Could call from stop/abort?
Can do that. As long as it's after {{this.stopped = true;}}, sleeper should 
respect that.
{quote}
Currently the sleeper is already woken in stop/abort.

v6 changes the test method to remove the dependency on log capture.

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell Master they are available. 
> If Master is initializing, especially on a big cluster where it can take a 
> while, particularly if something is amiss, the log every three seconds is 
> annoying and doesn't do anything of use. Back off on failure, up to a 
> reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.006.patch

> reportForDuty should do (exponential) backoff rather than retry every 3 
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: stack
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch, 
> HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell Master they are available. 
> If Master is initializing, especially on a big cluster where it can take a 
> while, particularly if something is amiss, the log every three seconds is 
> annoying and doesn't do anything of use. Back off on failure, up to a 
> reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to 
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, 
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; 
> sleeping and then retrying.
> 
> {code}
> For example, I am looking at a large cluster now that had a backlog of 
> procedure WALs. It is taking a couple of hours recreating the procedure-state 
> because there are millions of procedures outstanding. Meantime, the Master 
> log is just full of the above message -- every three seconds...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21197) TestServerCrashProcedureWithReplicas fails intermittently

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21197:
--
Description: 
An example failure report: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/]

Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
{{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}

  was:
An example failure report: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/]

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}


> TestServerCrashProcedureWithReplicas fails intermittently
> -
>
> Key: HBASE-21197
> URL: https://issues.apache.org/jira/browse/HBASE-21197
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Priority: Major
>
> An example failure report: 
> [https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/]
> Failing test methods are: {{testRecoveryAndDoubleExecutionOnRsWithMeta}}, 
> {{testRecoveryAndDoubleExecutionOnRsWithoutMeta}} and {{testCrashTargetRs}}.
> Specifically, the exception trace is:
> {code:java}
> java.lang.AssertionError: Crashed replica regions should not be assigned to 
> same region server
>   at 
> org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21197) TestServerCrashProcedureWithReplicas fails intermittently

2018-09-13 Thread Mingliang Liu (JIRA)
Mingliang Liu created HBASE-21197:
-

 Summary: TestServerCrashProcedureWithReplicas fails intermittently
 Key: HBASE-21197
 URL: https://issues.apache.org/jira/browse/HBASE-21197
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Mingliang Liu


An example failure report: 
[https://builds.apache.org/job/PreCommit-HBASE-Build/14396/testReport/]

Specifically, the exception trace is:
{code:java}
java.lang.AssertionError: Crashed replica regions should not be assigned to 
same region server
at 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas.assertReplicaDistributed(TestServerCrashProcedureWithReplicas.java:68){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-13 Thread Jack Bearden (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614198#comment-16614198
 ] 

Jack Bearden commented on HBASE-20993:
--

Hi Reid, thanks for checking in. I got pulled away for a family vacation but 
plan to make progress on this over the weekend.

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --
>
> Key: HBASE-20993
> URL: https://issues.apache.org/jira/browse/HBASE-20993
> Project: HBase
>  Issue Type: Bug
>  Components: Client, security
>Affects Versions: 1.2.6
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: HBASE-20993.001.patch, 
> HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, 
> HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, 
> HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, 
> HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, 
> HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.2.001.patch, 
> HBASE-20993.branch-1.wip.002.patch, HBASE-20993.branch-1.wip.patch, 
> yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml has hadoop.security.authentication:kerberos, 
> hbase.security.authentication:kerberos, and 
> hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal 
> are set correctly.
> Given a simple-auth HBase cluster and a kerberized HBase client application, 
> an application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>   at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738)
>   at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:753)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:674)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:607)
>   at 
> 

[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-09-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614186#comment-16614186
 ] 

Hadoop QA commented on HBASE-20306:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 2 new + 8 unchanged 
- 0 fixed = 10 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}122m 
33s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20306 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939616/HBASE-20306.002.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e6218759bd46 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 5d14c1af65 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14410/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14410/testReport/ |
| Max. process+thread count | 5232 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-09-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614180#comment-16614180
 ] 

Zach York commented on HBASE-21098:
---

Pushed to branch-2 and master. Wrangling a test before pushing it to branch-1.

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.master.001.patch, 
> HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, 
> HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, 
> HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, 
> HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, 
> HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, 
> HBASE-21098.master.012.patch, HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used to perform 
> point-in-time recovery. To do this, HBase creates a manifest of all the files 
> in all of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, utilizing S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> not atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate into a single manifest. When 
> HBase on S3 users have a significant number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary snapshot directory 
> that lives on a traditional filesystem like HDFS, circumventing these 
> bottlenecks.
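
For illustration, such a split might look like the following in hbase-site.xml. 
This is a minimal sketch only: the property name hbase.snapshot.working.dir is 
assumed to be the temporary snapshot location this change exposes, and the 
bucket/namenode paths are placeholders.

{code:xml}
<!-- Sketch: rootDir on S3, snapshot working dir on HDFS. Paths are illustrative. -->
<property>
  <name>hbase.rootdir</name>
  <value>s3a://my-hbase-bucket/hbase</value>
</property>
<property>
  <name>hbase.snapshot.working.dir</name>
  <value>hdfs://namenode:8020/hbase/.snapshot-tmp</value>
</property>
{code}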



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-09-13 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-21098:
--
Fix Version/s: 2.2.0
   3.0.0

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.master.001.patch, 
> HBASE-21098.master.002.patch, HBASE-21098.master.003.patch, 
> HBASE-21098.master.004.patch, HBASE-21098.master.005.patch, 
> HBASE-21098.master.006.patch, HBASE-21098.master.007.patch, 
> HBASE-21098.master.008.patch, HBASE-21098.master.009.patch, 
> HBASE-21098.master.010.patch, HBASE-21098.master.011.patch, 
> HBASE-21098.master.012.patch, HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used to perform 
> point-in-time recovery. To do this, HBase creates a manifest of all the files 
> in all of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, utilizing S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> not atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate into a single manifest. When 
> HBase on S3 users have a significant number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary snapshot directory 
> that lives on a traditional filesystem like HDFS, circumventing these 
> bottlenecks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-09-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614121#comment-16614121
 ] 

Zach York commented on HBASE-20734:
---

Oh weird, I didn't notice the indenting... I'll remove this.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch, 
> HBASE-20734.branch-1.002.patch, HBASE-20734.branch-1.003.patch, 
> HBASE-20734.branch-1.004.patch, HBASE-20734.master.001.patch, 
> HBASE-20734.master.002.patch, HBASE-20734.master.003.patch, 
> HBASE-20734.master.004.patch, HBASE-20734.master.005.patch, 
> HBASE-20734.master.006.patch, HBASE-20734.master.007.patch, 
> HBASE-20734.master.008.patch, HBASE-20734.master.009.patch, 
> HBASE-20734.master.010.patch, HBASE-20734.master.011.patch, 
> HBASE-20734.master.012.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than the hbase rootdir w.r.t. recovered edits, since the recovered edits 
> directory currently lives under rootdir.
> Such a setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir .
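
For context, a minimal hbase-site.xml sketch of the kind of split deployment 
described above (the paths are hypothetical; with this change, recovered edits 
would follow hbase.wal.dir):

{code:xml}
<!-- Sketch: rootdir on bulk storage, WALs on faster media. Paths are illustrative. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://namenode:8020/hbase</value>
</property>
<property>
  <name>hbase.wal.dir</name>
  <value>hdfs://fast-namenode:8020/hbase-wal</value>
</property>
{code}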



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-09-13 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614096#comment-16614096
 ] 

Colin Garcia commented on HBASE-20306:
--

Thanks for the comments, Andrew. I've updated using the getHistogramReport 
functionality. The only concern I have is that the values won't exactly match 
the "Overall Status" numbers due to some casting from doubles to long. Here is a 
snippet of the output. Let me know what you think :) Hoping this is a step in 
the right direction.

 
{code:java}
2018-09-13 14:30:52,732 INFO  
[MultiThreadedAction-ProgressReporter-1536874207705] util.MultiThreadedAction: 
[W:20] Keys=82495, cols=927.8 K, time=00:00:45 Overall: [keys/s= 1832, 
latency=10.82 ms] Current: [keys/s=2058, latency=9.64 ms], wroteUpTo=-1 
2018-09-13 14:30:57,733 INFO  
[MultiThreadedAction-ProgressReporter-1536874207705] util.MultiThreadedAction: 
[W:20] Keys=92516, cols=1.0 M, time=00:00:50 Overall: [keys/s= 1849, 
latency=10.72 ms] Current: [keys/s=2004, latency=9.89 ms], wroteUpTo=-1 
2018-09-13 14:31:02,738 INFO  
[MultiThreadedAction-ProgressReporter-1536874207705] util.MultiThreadedAction: 
RUN SUMMARY 
KEYS PER SECOND: 
mean=1849.90, min=1108.00, max=2065.00, stdDev=291.71, 50th=1945.50, 
75th=2017.50, 95th=2065.00, 99th=2065.00, 99.9th=2065.00, 99.99th=2065.00, 
99.999th=2065.00
LATENCY: 
mean=10.60, min=9.00, max=17.00, stdDev=2.41, 50th=10.00, 75th=10.50, 
95th=17.00, 99th=17.00, 99.9th=17.00, 99.99th=17.00, 99.999th=17.00
{code}
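
For what it's worth, the mismatch comes down to Java's narrowing conversion; a 
tiny illustration with made-up values:

{code:java}
double overallKeysPerSec = 1849.90;       // the running "Overall" average is kept as a double
long reported = (long) overallKeysPerSec; // the histogram report narrows to long: 1849, not 1850
{code}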
 

 

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306.000.patch, HBASE-20306.001.patch, 
> HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-09-13 Thread Colin Garcia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Garcia updated HBASE-20306:
-
Attachment: HBASE-20306.002.patch

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306.000.patch, HBASE-20306.001.patch, 
> HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21188) Print heap and gc informations in our junit ResourceChecker

2018-09-13 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614076#comment-16614076
 ] 

Mike Drob commented on HBASE-21188:
---

What do "GCCount LEAK?" and "UsedHeapMemoryMB LEAK?" mean in this context?

Is that why you were suggesting not to track GC as a resource? Total heap usage 
detecting a leak also seems unlikely, since we probably are building up lots of 
structures during the tests that maybe we aren't cleaning up, but also maybe 
don't need to.

> Print heap and gc informations in our junit ResourceChecker
> ---
>
> Key: HBASE-21188
> URL: https://issues.apache.org/jira/browse/HBASE-21188
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21188.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614065#comment-16614065
 ] 

Nihal Jain edited comment on HBASE-21196 at 9/13/18 9:12 PM:
-

Alternatively, we can modify 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54]
 to return false when the original exception is null. In fact, we return false 
in 
[isConnectionException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L142]
 while we return true in 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54],
 which shows we already have an inconsistency here.
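
A minimal sketch of that alternative ({{classifyAsBefore}} below is a 
hypothetical stand-in for the method's existing classification logic, which 
would stay unchanged):

{code:java}
public static boolean isMetaClearingException(Throwable cur) {
  if (cur == null) {
    // Consistent with isConnectionException: a null cause should clear nothing.
    return false;
  }
  return classifyAsBefore(cur); // hypothetical: the existing checks, as-is
}
{code}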


was (Author: nihaljain.cs):
Alternately we can modify 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54]
 to return false in case original exception is null. In fact we return false in 
case of 
[isConnectionException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L142]
 while we return false in case of 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54].
 Which shows we have an inconsistency here itself. 

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is put operation 
> of HTableMultiplexer (Might not be the only one). This may impact the 
> performance of the system severely as all new ops directed to that server 
> will have to go to zk first to get the meta table address and then get the 
> location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 

[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614065#comment-16614065
 ] 

Nihal Jain commented on HBASE-21196:


Alternately we can modify 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54]
 to return false in case original exception is null. In fact we return false in 
case of 
[isConnectionException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L142]
 while we return false in case of 
[isMetaClearingException|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L54].
 Which shows we have an inconsistency here itself. 

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is put operation 
> of HTableMultiplexer (Might not be the only one). This may impact the 
> performance of the system severely as all new ops directed to that server 
> will have to go to zk first to get the meta table address and then get the 
> location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the string "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present for every put.
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears 
> the server cache when tablename is 

[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Description: 
*Problem:* Operations which use 
{{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
MultiResponse, int)}} API with tablename set to null reset the meta cache of 
the corresponding server after each call. One such operation is put operation 
of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.
{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 param: region= 
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
totalTime: 0
2018-09-13 22:21:15,516 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, 
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
table=hbase:meta, 
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
Advancing internal small scanner to startKey at 
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
meta region location in ZK, 
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}

From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that 
the string "Removed all cached region locations that map" and "Looking up meta 
region location in ZK" are present for every put.

*Analysis:*
 The problem occurs because the {{cleanServerCache}} method always clears 
the server cache when tablename is null and the exception is null. See 
[AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]

{code:java}
private void cleanServerCache(ServerName server, Throwable regionException) {
  if (tableName == null && ClientExceptionsUtil.isMetaClearingException(regionException)) {
    // For multi-actions, we don't have a table name, but we want to make sure to clear
    // the cache in case there were location-related exceptions. We don't want to clear
    // the cache for every possible exception that comes through, however.
    asyncProcess.connection.clearCaches(server);
  }
}
{code}

The problem is that 
[ClientExceptionsUtil.isMetaClearingException(regionException)|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java#L51]
 assumes that the caller takes care of the null-exception check before 
calling the method, i.e. it will return true if the passed exception is null, 
which may not be a correct assumption.

  was:
*Problem:* Operations which use 
{{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
MultiResponse, int)}} API with tablename set to null reset the meta cache of 
the corresponding server after each call. One such operation is put operation 
of HTableMultiplexer (Might not be the only one). This may impact the 

[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Description: 
*Problem:* Operations which use 
{{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
MultiResponse, int)}} API with tablename set to null reset the meta cache of 
the corresponding server after each call. One such operation is put operation 
of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.
{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 param: region= 
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
totalTime: 0
2018-09-13 22:21:15,516 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, 
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
table=hbase:meta, 
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
Advancing internal small scanner to startKey at 
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
meta region location in ZK, 
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}

From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that 
the string "Removed all cached region locations that map" and "Looking up meta 
region location in ZK" are present for every put.

*Analysis:*
 The problem occurs because the {{cleanServerCache}} method always clears 
the server cache when tablename is null and the exception is null. See 
[AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]

{code:java}
private void cleanServerCache(ServerName server, Throwable regionException) {
  if (tableName == null && ClientExceptionsUtil.isMetaClearingException(regionException)) {
    // For multi-actions, we don't have a table name, but we want to make sure to clear
    // the cache in case there were location-related exceptions. We don't want to clear
    // the cache for every possible exception that comes through, however.
    asyncProcess.connection.clearCaches(server);
  }
}
{code}

The problem is that ClientExceptionsUtil.isMetaClearingException(regionException) 
assumes that the caller takes care of the null-exception check before calling 
the method, i.e. it will return true if the passed exception is null, which may 
not be a correct assumption.

  was:
*Problem:* Operations which use 
{{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
MultiResponse, int)}} API with tablename set to null reset the meta cache of 
the corresponding server after each call. One such operation is put operation 
of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table 

[jira] [Comment Edited] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614060#comment-16614060
 ] 

Nihal Jain edited comment on HBASE-21196 at 9/13/18 9:02 PM:
-

Attaching a patch [^HBASE-21196.master.001.patch] with a UT to expose the 
problem and one possible fix, i.e. the caller validates that the exception is 
not null before calling the clear-cache method. We need not do this null check 
in {{AsyncRequestFutureImpl.receiveGlobalFailure}}, as that method is called 
only after an exception is caught.
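
A minimal sketch of that caller-side guard (an illustration of the approach 
described above, not the attached patch itself):

{code:java}
private void cleanServerCache(ServerName server, Throwable regionException) {
  // Check for null first: a null Throwable should never wipe the server's meta cache.
  if (tableName == null && regionException != null
      && ClientExceptionsUtil.isMetaClearingException(regionException)) {
    asyncProcess.connection.clearCaches(server);
  }
}
{code}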


was (Author: nihaljain.cs):
Attaching a patch  [^HBASE-21196.master.001.patch]  with a UT to expose the 
problem and one of the possible fix i.e. called validates the exception is not 
null before calling clear cache method.

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is put operation 
> of HTableMultiplexer (Might not be the only one). This may impact the 
> performance of the system severely as all new ops directed to that server 
> will have to go to zk first to get the meta table address and then get the 
> location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the string "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present 800+ times for 1000 back to back 
> puts.
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears 
> the server cache when tablename is null and the exception is null. See 
> 

[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Fix Version/s: 3.0.0
   Attachment: HBASE-21196.master.001.patch
   Status: Patch Available  (was: Open)

Attaching a patch [^HBASE-21196.master.001.patch] with a UT to expose the 
problem and one possible fix, i.e. the caller validates that the exception is 
not null before calling the clear-cache method.

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is put operation 
> of HTableMultiplexer (Might not be the only one). This may impact the 
> performance of the system severely as all new ops directed to that server 
> will have to go to zk first to get the meta table address and then get the 
> location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the string "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present 800+ times for 1000 back to back 
> puts.
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears 
> the server cache when tablename is null and the exception is null. See 
> [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]
> {code:java}
> private void cleanServerCache(ServerName server, Throwable regionException) {
> if (tableName == null && 
> ClientExceptionsUtil.isMetaClearingException(regionException)) {
>   // For multi-actions, we don't have a table 

[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Description: 
*Problem:* Operations which use 
{{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
MultiResponse, int)}} API with tablename set to null reset the meta cache of 
the corresponding server after each call. One such operation is put operation 
of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.
{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 param: region= 
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
totalTime: 0
2018-09-13 22:21:15,516 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, 
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
table=hbase:meta, 
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
Advancing internal small scanner to startKey at 
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
meta region location in ZK, 
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}
From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that 
the string "Removed all cached region locations that map" and "Looking up meta 
region location in ZK" are present 800+ times for 1000 back to back puts.

*Analysis:*
 The problem occurs because the {{cleanServerCache}} method always clears 
the server cache when tablename is null and the exception is null. See 
[AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]

{code:java}
private void cleanServerCache(ServerName server, Throwable regionException) {
  if (tableName == null && ClientExceptionsUtil.isMetaClearingException(regionException)) {
    // For multi-actions, we don't have a table name, but we want to make sure to clear
    // the cache in case there were location-related exceptions. We don't want to clear
    // the cache for every possible exception that comes through, however.
    asyncProcess.connection.clearCaches(server);
  }
}
{code}

The problem is that ClientExceptionsUtil.isMetaClearingException(regionException) 
assumes that the caller takes care of the null-exception check before calling 
the method, i.e. it will return true if the passed exception is null, which may 
not be a correct assumption.

  was:
Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, 
ServerName, MultiResponse, int)}} API with tablename set to null reset the meta 
cache of the corresponding server after each call. One such operation is put 
operation of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of 

[jira] [Commented] (HBASE-21195) Support Log storage similar to FB LogDevice

2018-09-13 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614053#comment-16614053
 ] 

Mike Drob commented on HBASE-21195:
---

I was thinking about this earlier; it would be a great addition.

> Support Log storage similar to FB LogDevice
> ---
>
> Key: HBASE-21195
> URL: https://issues.apache.org/jira/browse/HBASE-21195
> Project: HBase
>  Issue Type: New Feature
>Reporter: jagan
>Priority: Major
>
> Log storage, which is write-once and sequential, can be optimized in the 
> following ways (see the sketch after this list):
> 1. Generated keys should be incremental.
> 2. The HFile key index can be a plain range index and need not use a 
> BloomFilter.
> 3. Instead of compaction, periodic deletion of old files based on TTL can be 
> supported.
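
As a rough illustration of points 2 and 3 with the HBase 2.x client API (a 
hedged sketch: the table and family names are hypothetical, and point 1 would 
live in the writer's key design rather than in table settings):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.regionserver.BloomType;
import org.apache.hadoop.hbase.util.Bytes;

public class LogTableSketch {
  // Log-style table: no bloom filter (range lookups only) and TTL-based
  // expiry so old cells age out instead of relying on app-level cleanup.
  static void createLogTable(Admin admin) throws IOException {
    admin.createTable(TableDescriptorBuilder.newBuilder(TableName.valueOf("logdevice"))
        .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("l"))
            .setBloomFilterType(BloomType.NONE)
            .setTimeToLive(7 * 24 * 60 * 60) // seconds; here, 7 days
            .build())
        .build());
  }
}
{code}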



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Description: 
Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, 
ServerName, MultiResponse, int)}} API with tablename set to null reset the meta 
cache of the corresponding server after each call. One such operation is put 
operation of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.

{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 param: region= 
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
totalTime: 0
2018-09-13 22:21:15,516 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, 
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
table=hbase:meta, 
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
Advancing internal small scanner to startKey at 
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
meta region location in ZK, 
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}

From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt]  one can see 
that the string "Removed all cached region locations that map" and "Looking up 
meta region location in ZK" are present 800+ times for 1000 back to back puts.

  was:
Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, 
ServerName, MultiResponse, int)}} API with tablename set to null reset the meta 
cache of the corresponding server after each call. One such operation is put 
operation of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.

{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 

[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21196:
---
Attachment: HTableMultiplexer1000Puts.UT.txt

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Attachments: HTableMultiplexer1000Puts.UT.txt
>
>
> Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, 
> ServerName, MultiResponse, int)}} API with tablename set to null reset the 
> meta cache of the corresponding server after each call. One such operation is 
> put operation of HTableMultiplexer (Might not be the only one). This may 
> impact the performance of the system severely as all new ops directed to that 
> server will have to go to zk first to get the meta table address and then get 
> the location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-13 Thread Nihal Jain (JIRA)
Nihal Jain created HBASE-21196:
--

 Summary: HTableMultiplexer clears the meta cache after every put 
operation
 Key: HBASE-21196
 URL: https://issues.apache.org/jira/browse/HBASE-21196
 Project: HBase
  Issue Type: Bug
  Components: Performance
Affects Versions: 3.0.0, 1.3.3, 2.2.0
Reporter: Nihal Jain
Assignee: Nihal Jain
 Attachments: HTableMultiplexer1000Puts.UT.txt

Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, 
ServerName, MultiResponse, int)}} API with tablename set to null reset the meta 
cache of the corresponding server after each call. One such operation is put 
operation of HTableMultiplexer (Might not be the only one). This may impact the 
performance of the system severely as all new ops directed to that server will 
have to go to zk first to get the meta table address and then get the location 
of the table region as it will become empty after every htablemultiplexer put.

From the logs below, one can see after every other put the cached region 
locations are cleared. As a side effect of this, before every put the server 
needs to contact zk and get meta table location and read meta to get region 
locations of the table.

{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed 
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE 
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): 
callId: 218 service: ClientService methodName: Get size: 137 connection: 
127.0.0.1:42338 param: region= 
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
totalTime: 0
2018-09-13 22:21:15,516 TRACE 
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, 
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
table=hbase:meta, 
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
Advancing internal small scanner to startKey at 
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
meta region location in ZK, 
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}
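The shape of the client-side path that would produce the trace above is sketched 
below. This is a hypothetical illustration only, not the actual HBase source; the 
method and field names are made up for clarity.

{code:java}
// Hypothetical sketch: when no table name accompanies the response, the
// invalidation cannot be narrowed to one region, so the whole per-server
// cache is dropped -- matching the "Removed all cached region locations
// that map to ..." trace above.
void updateCachedLocations(TableName tableName, byte[] row, ServerName source) {
  if (tableName == null) {
    metaCache.clearCache(source);          // wipes every entry for the server
    return;
  }
  metaCache.clearCache(tableName, row, source); // drops only the stale entry
}
{code}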




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21177) Add per-table metrics on getTime,putTime and scanTime

2018-09-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613964#comment-16613964
 ] 

Andrew Purtell edited comment on HBASE-21177 at 9/13/18 7:37 PM:
-

This doesn't seem quite right:
{code}
 this.hashCode = this.tableName.hashCode();
+   getHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + GET_REQUEST_TIME,
+   GET_REQUEST_TIME_DESC);
+   putHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + PUT_REQUEST_TIME,
+   PUT_REQUEST_TIME_DESC);
+   scanHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + SCAN_REQUEST_TIME,
+   SCAN_REQUEST_TIME_DESC);
{code}
Shouldn't this code register the metrics using {{tableNamePrefix}} instead of 
{{"Namespace_default_table_"}}?

Also, use EnvironmentEdge#getCurrentTime instead of System#currentTimeMillis
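
A sketch of the suggested change, assuming {{tableNamePrefix}} already carries the 
namespace-qualified prefix built elsewhere in the patch:

{code:java}
// Sketch only: use the table's own prefix rather than a hardcoded
// "Namespace_default_table_" string, which is wrong for any table
// outside the default namespace.
getHis = registry.newHistogram(tableNamePrefix + "_metric_" + GET_REQUEST_TIME,
    GET_REQUEST_TIME_DESC);
putHis = registry.newHistogram(tableNamePrefix + "_metric_" + PUT_REQUEST_TIME,
    PUT_REQUEST_TIME_DESC);
scanHis = registry.newHistogram(tableNamePrefix + "_metric_" + SCAN_REQUEST_TIME,
    SCAN_REQUEST_TIME_DESC);

// And for timings, the pluggable clock rather than the system clock:
long start = EnvironmentEdgeManager.currentTime(); // not System.currentTimeMillis()
{code}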



was (Author: apurtell):
This doesn't seem quite right:
{code}
 this.hashCode = this.tableName.hashCode();
+   getHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + GET_REQUEST_TIME,
+   GET_REQUEST_TIME_DESC);
+   putHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + PUT_REQUEST_TIME,
+   PUT_REQUEST_TIME_DESC);
+   scanHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + SCAN_REQUEST_TIME,
+   SCAN_REQUEST_TIME_DESC);
{code}
Shouldn't this code register the metrics using {{tableNamePrefix}} instead of 
{{"Namespace_default_table_"}}?

> Add per-table metrics on getTime,putTime and scanTime
> -
>
> Key: HBASE-21177
> URL: https://issues.apache.org/jira/browse/HBASE-21177
> Project: HBase
>  Issue Type: Task
>  Components: metrics
>Affects Versions: 2.0.2
>Reporter: xijiawen
>Priority: Major
> Fix For: HBASE-14850
>
> Attachments: HBASE-21177.patch
>
>
> Adds getTime, putTime, and scanTime to the per-table metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21177) Add per-table metrics on getTime,putTime and scanTime

2018-09-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613964#comment-16613964
 ] 

Andrew Purtell edited comment on HBASE-21177 at 9/13/18 7:36 PM:
-

This doesn't seem quite right:
{code}
 this.hashCode = this.tableName.hashCode();
+   getHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + GET_REQUEST_TIME,
+   GET_REQUEST_TIME_DESC);
+   putHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + PUT_REQUEST_TIME,
+   PUT_REQUEST_TIME_DESC);
+   scanHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + SCAN_REQUEST_TIME,
+   SCAN_REQUEST_TIME_DESC);
{code}
Shouldn't this code register the metrics using {{tableNamePrefix}} instead of 
{{"Namespace_default_table_"}}?


was (Author: apurtell):
This doesn't seem quite right:
{code}
 this.hashCode = this.tableName.hashCode();
+   getHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + GET_REQUEST_TIME,
+   GET_REQUEST_TIME_DESC);
+   putHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + PUT_REQUEST_TIME,
+   PUT_REQUEST_TIME_DESC);
+   scanHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + SCAN_REQUEST_TIME,
+   SCAN_REQUEST_TIME_DESC);
{code}
Shouldn't this code register the metrics using {{tableNamePrefix}} instead of 
{{"Namespace_default_table_" }}?

> Add per-table metrics on getTime,putTime and scanTime
> -
>
> Key: HBASE-21177
> URL: https://issues.apache.org/jira/browse/HBASE-21177
> Project: HBase
>  Issue Type: Task
>  Components: metrics
>Affects Versions: 2.0.2
>Reporter: xijiawen
>Priority: Major
> Fix For: HBASE-14850
>
> Attachments: HBASE-21177.patch
>
>
> Adds getTime, putTime, and scanTime to the per-table metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21177) Add per-table metrics on getTime,putTime and scanTime

2018-09-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613964#comment-16613964
 ] 

Andrew Purtell commented on HBASE-21177:


This doesn't seem quite right:
{code}
 this.hashCode = this.tableName.hashCode();
+   getHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + GET_REQUEST_TIME,
+   GET_REQUEST_TIME_DESC);
+   putHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + PUT_REQUEST_TIME,
+   PUT_REQUEST_TIME_DESC);
+   scanHis = registry.newHistogram("Namespace_default_table_" + tblName + "_metric_" + SCAN_REQUEST_TIME,
+   SCAN_REQUEST_TIME_DESC);
{code}
Shouldn't this code register the metrics using {{tableNamePrefix}} instead of 
{{"Namespace_default_table_" }}?

> Add per-table metrics on getTime,putTime and scanTime
> -
>
> Key: HBASE-21177
> URL: https://issues.apache.org/jira/browse/HBASE-21177
> Project: HBase
>  Issue Type: Task
>  Components: metrics
>Affects Versions: 2.0.2
>Reporter: xijiawen
>Priority: Major
> Fix For: HBASE-14850
>
> Attachments: HBASE-21177.patch
>
>
> Adds getTime, putTime, and scanTime to the per-table metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-09-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613962#comment-16613962
 ] 

Andrew Purtell commented on HBASE-20306:


I don't think the changes are quite what was asked for. The request is for 
post-run summary statistics, like the summary reports produced by 
PerformanceEvaluation via getHistogramReport, for example. The patch here also 
changes the running status logging via use of the refactored 
getOverallRunInformation(), which is not what we want, I think. Just add a 
detailed statistics dump after the run is complete.
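
Roughly the following shape is what is being asked for. This is a sketch only: 
{{runLoadTest()}} and {{getHistogramReport()}} are hypothetical names standing in 
for the existing run loop and for whatever detailed-report helper the patch adds.

{code:java}
// Sketch: leave the periodic status logging untouched and add a
// one-time detailed dump once the run has finished.
protected int runAndReport() throws Exception {
  int ret = runLoadTest();            // existing run loop, unchanged
  LOG.info("Post-run summary:");
  LOG.info(getHistogramReport());     // hypothetical detailed-stats helper
  return ret;
}
{code}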

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306.000.patch, HBASE-20306.001.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-13 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613949#comment-16613949
 ] 

Vladimir Rodionov commented on HBASE-16458:
---

Yes, with tearDown it took 63 min; without it, 44 min. What tearDown does is:

# clean up snapshots
# stop the HBase and YARN mini clusters

I did not spend time analyzing this; just an observation.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.v1.txt, 16458.v2.txt, 16458.v2.txt, 
> 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.631 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Running 

[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-13 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613941#comment-16613941
 ] 

Josh Elser commented on HBASE-16458:


{quote}each test is executed in a separate JVM instance
{quote}
Interesting. Didn't realize we had reuseForks disabled in HBase :). Seems like 
as long as the surefire-plugin is configured this way, your change is fine.
{quote}This actually saved almost 30% of overall execution time.
{quote}
That's... crazy. Did you dig in to see why this was the case? Curious to know 
how much of it is in "our" code compared to Hadoop's.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.v1.txt, 16458.v2.txt, 16458.v2.txt, 
> 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running 

[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613871#comment-16613871
 ] 

Sean Busbey commented on HBASE-21182:
-

+1 confirmed things work locally here too.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613747#comment-16613747
 ] 

stack commented on HBASE-21190:
---

Thanks for reviews [~allan163] and [~balazs.meszaros]

> Log files and count of entries in each as we load from the MasterProcWAL store
> --
>
> Key: HBASE-21190
> URL: https://issues.apache.org/jira/browse/HBASE-21190
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21190.branch-2.1.001.patch
>
>
> Sometimes this can take a while especially if loads of files. Emit counts of 
> entries so operator gets sense of scale of procedures being processed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613746#comment-16613746
 ] 

Hadoop QA commented on HBASE-21182:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 9s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 19s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hbase-assembly in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21182 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939577/HBASE-21182.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  |
| uname | Linux ec9dcbd1fc99 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 5d14c1af65 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14408/testReport/ |
| Max. process+thread count | 87 (vs. ulimit of 1) |
| modules | C: hbase-assembly U: hbase-assembly |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14408/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: 

[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613703#comment-16613703
 ] 

Toshihiro Suzuki commented on HBASE-21182:
--

Sure [~busbey].

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613701#comment-16613701
 ] 

Sean Busbey edited comment on HBASE-21182 at 9/13/18 4:07 PM:
--

bq. I just attached the v2 patch. I will commit it tomorrow if no objections.

please wait for a sign-off from some committer before pushing.

I'm waiting to see what QABot says.

(Edit because I quoted the wrong bit.)


was (Author: busbey):
bq. I will open other Jiras for the nightly test and the documentation. Thanks.

please wait for a sign-off from some committer before pushing.

I'm waiting to see what QABot says.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613701#comment-16613701
 ] 

Sean Busbey commented on HBASE-21182:
-

bq. I will open other Jiras for the nightly test and the documentation. Thanks.

please wait for a sign-off from some committer before pushing.

I'm waiting to see what QABot says.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613696#comment-16613696
 ] 

Toshihiro Suzuki commented on HBASE-21182:
--

 [~busbey] Thank you for reviewing. I just attached the v2 patch. I will commit 
it tomorrow if no objections.

{code}
I strongly suggest someone add a test for it to nightly and probably add a 
paragraph to the "Building Apache HBase" section of the ref guide after the 
advice on how to quickly build a tarball.
{code}
I will open other Jiras for the nightly test and the documentation. Thanks.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613696#comment-16613696
 ] 

Toshihiro Suzuki edited comment on HBASE-21182 at 9/13/18 4:04 PM:
---

 [~busbey] Thank you for reviewing. I just attached the v2 patch. I will commit 
it tomorrow if no objections.

{quote}
I strongly suggest someone add a test for it to nightly and probably add a 
paragraph to the "Building Apache HBase" section of the ref guide after the 
advice on how to quickly build a tarball.
{quote}
I will open other Jiras for the nightly test and the documentation. Thanks.


was (Author: brfrn169):
 [~busbey] Thank you for reviewing. I just attached the v2 patch. I will commit 
it tomorrow if no objections.

{code}
I strongly suggest someone add a test for it to nightly and probably add a 
paragraph to the "Building Apache HBase" section of the ref guide after the 
advice on how to quickly build a tarball.
{code}
I will open other Jiras for the nightly test and the documentation. Thanks.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21182:

Fix Version/s: 2.1.1
   2.2.0
   3.0.0
Affects Version/s: 2.1.1
   2.2.0
   Status: Patch Available  (was: Open)

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Toshihiro Suzuki (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki updated HBASE-21182:
-
Attachment: HBASE-21182.master.002.patch

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21182.master.001.patch, 
> HBASE-21182.master.002.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20615) emphasize use of shaded client jars when they're present in an install

2018-09-13 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613626#comment-16613626
 ] 

Sean Busbey commented on HBASE-20615:
-

Please open new issues instead of commenting on old ones.

> emphasize use of shaded client jars when they're present in an install
> --
>
> Key: HBASE-20615
> URL: https://issues.apache.org/jira/browse/HBASE-20615
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Client, Usability
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20615.0.patch, HBASE-20615.1.patch, 
> HBASE-20615.2.patch
>
>
> Working through setting up an IT for our shaded artifacts in HBASE-20334 
> makes our lack of packaging seem like an oversight. While I could work around 
> by pulling the shaded clients out of whatever build process built the 
> convenience binary that we're trying to test, it seems v awkward.
> After reflecting on it more, it makes more sense to me for there to be a 
> common place in the install that folks running jobs against the cluster can 
> rely on. If they need to run without a full hbase install, that should still 
> work fine via e.g. grabbing from the maven repo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613586#comment-16613586
 ] 

Sean Busbey commented on HBASE-21182:
-

{code}
-  jline,jruby-complete
+  jline,jruby-complete,hbase-shaded-mapreduce
{code}

You should exclude all of the shaded client artifacts.
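
For example, something along these lines (a sketch only; the extra artifact names 
assume the current set of hbase-shaded modules and the same comma-separated 
exclusion list shown above):

{code}
jline,jruby-complete,hbase-shaded-client,hbase-shaded-client-byo-hadoop,hbase-shaded-mapreduce
{code}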

{quote}
Yes. I ran start-hbase.sh from the source checkout directory after running mvn 
clean install -DskipTests. I usually do this to test my patch. At least, before 
HBASE-21153 I was able to do this. You mean it's unexpected?

...
I think running bin/start-hbase.sh in the source checkout directory is 
expected, because bin/hbase obviously expects it as the following:
...
{quote}

I believe stack uses this same thing, so it's definitely expected.

 If folks want it to keep working reliably, I strongly suggest someone add a 
test for it to nightly and probably add a paragraph to the "Building Apache 
HBase" section of the ref guide after the advice on how to quickly build a 
tarball. The current implementation is brittle and not covered by any checks 
for what would break in an actual deployment. Related, maybe it's time we talk 
about better ways to do "quick" testing of things instead of maintaining this 
shadow of a normal deployment. Something for dev@; no need to block this fix.

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21182.master.001.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21182) Failed to execute start-hbase.sh

2018-09-13 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-21182:
---

Assignee: Toshihiro Suzuki

> Failed to execute start-hbase.sh
> 
>
> Key: HBASE-21182
> URL: https://issues.apache.org/jira/browse/HBASE-21182
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Subrat Mishra
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21182.master.001.patch
>
>
> Built the master branch as below:
> {code:java}
> mvn clean install -DskipTests{code}
> Then tried to execute start-hbase.sh, which failed with a NoClassDefFoundError:
> {code:java}
> ./bin/start-hbase.sh 
> Error: A JNI error has occurred, please check your installation and try again
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/shaded/org/eclipse/jetty/server/Connector
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.shaded.org.eclipse.jetty.server.Connector{code}
> Note: It worked after reverting HBASE-21153



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613541#comment-16613541
 ] 

Ted Yu commented on HBASE-21160:


If there is no assertion in the try block where Throwable is caught, you don't 
need to change it.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new PrivilegedExceptionAction() {
>     @Override
>     public Void run() throws Exception {
>       try (Connection connection = ConnectionFactory.createConnection(conf);
>         ...
>         assertEquals(1, next.length);
>       } catch (Throwable t) {
>         throw new IOException(t);
>       }
> {code}
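
One common fix pattern is to rethrow the {{AssertionError}} before the broad 
{{Throwable}} catch, so a failed assertion still fails the test instead of being 
wrapped. A sketch, with the surrounding test code trimmed down:

{code:java}
try (Connection connection = ConnectionFactory.createConnection(conf)) {
  // ... scan and collect results into 'next' ...
  assertEquals(1, next.length);
} catch (AssertionError ae) {
  throw ae;                  // let the test failure propagate untouched
} catch (Throwable t) {
  throw new IOException(t);  // wrap only genuine runtime failures
}
{code}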



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21187) The HBase UTs are extremely slow on some jenkins node

2018-09-13 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613502#comment-16613502
 ] 

Duo Zhang commented on HBASE-21187:
---

We always succeed on H4, but on other machines we are likely to fail... And 
there is no big difference between the machines... Strange...

> The HBase UTs are extremely slow on some jenkins node
> -
>
> Key: HBASE-21187
> URL: https://issues.apache.org/jira/browse/HBASE-21187
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Duo Zhang
>Priority: Major
>
> Looking at the flaky dashboard for the master branch, the top several UTs are 
> likely to fail at the same time. One thing the failed flaky-test jobs have in 
> common is that their execution time is more than one hour, while successful 
> executions usually take only about half an hour.
> I have also compared the output for 
> TestRestoreSnapshotFromClientWithRegionReplicas: for a successful run, the 
> DisableTableProcedure can finish within one second, while for a failed run it 
> can take even more than half a minute.
> Not sure what the real problem is, but it seems that for the failed runs there 
> are likely time holes in the output, i.e., no log output for several seconds. 
> Like this:
> {noformat}
> 2018-09-11 21:08:08,152 INFO  [PEWorker-4] 
> procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, 
> hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 
> 12.9380sec
> 2018-09-11 21:08:15,590 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=33663] 
> master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
> {noformat}
> No log output for about 7 seconds.
> And for a successful run, the same place
> {noformat}
> 2018-09-12 07:47:32,488 INFO  [PEWorker-7] 
> procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, 
> hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 
> 1.2220sec
> 2018-09-12 07:47:32,881 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=59079] 
> master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
> {noformat}
> There is no such hole.
> Maybe there is a big GC pause?
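
One quick way to check that hypothesis is to dump per-collector GC time via the 
standard JMX beans. A standalone snippet, not part of any patch here:

{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcReport {
  public static void main(String[] args) {
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      // Cumulative collection count and time spent collecting since JVM start.
      System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
          + ", timeMs=" + gc.getCollectionTime());
    }
  }
}
{code}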



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21168) BloomFilterUtil uses hardcoded randomness

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613487#comment-16613487
 ] 

Hudson commented on HBASE-21168:


Results for branch master
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/489/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> BloomFilterUtil uses hardcoded randomness
> -
>
> Key: HBASE-21168
> URL: https://issues.apache.org/jira/browse/HBASE-21168
> Project: HBase
>  Issue Type: Task
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-21168.branch-1.002.patch, 
> HBASE-21168.master.001.patch, HBASE-21168.master.002.patch
>
>
> This was flagged by a Fortify scan and while it doesn't appear to be a real 
> issue, it's pretty easy to take care of anyway.
> The hardcoded Random can be moved to the test class that actually needs it to 
> make the static analysis happy.
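
The shape of such a change might look like the following sketch (the names and 
seed value are illustrative, not the actual patch):

{code:java}
// In BloomFilterUtil (production code): no hardcoded Random, only a hook.
private static Random randomGeneratorForTest;

static void setRandomGeneratorForTest(Random random) {
  randomGeneratorForTest = random;
}

// In the test class: the deterministic seed lives with the code that needs it.
BloomFilterUtil.setRandomGeneratorForTest(new Random(283742987L));
{code}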



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21189) flaky job should gather machine stats

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613490#comment-16613490
 ] 

Hudson commented on HBASE-21189:


Results for branch master
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/489/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> flaky job should gather machine stats
> -
>
> Key: HBASE-21189
> URL: https://issues.apache.org/jira/browse/HBASE-21189
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21189.0.patch
>
>
> flaky test should gather all the same environment information as our normal 
> nightly tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21188) Print heap and gc informations in our junit ResourceChecker

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613489#comment-16613489
 ] 

Hudson commented on HBASE-21188:


Results for branch master
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/489/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Print heap and gc informations in our junit ResourceChecker
> ---
>
> Key: HBASE-21188
> URL: https://issues.apache.org/jira/browse/HBASE-21188
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21188.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613488#comment-16613488
 ] 

Hudson commented on HBASE-21190:


Results for branch master
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/489/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/489//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Log files and count of entries in each as we load from the MasterProcWAL store
> --
>
> Key: HBASE-21190
> URL: https://issues.apache.org/jira/browse/HBASE-21190
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21190.branch-2.1.001.patch
>
>
> Sometimes this can take a while especially if loads of files. Emit counts of 
> entries so operator gets sense of scale of procedures being processed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread liubangchen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613394#comment-16613394
 ] 

liubangchen commented on HBASE-21160:
-

Hi [~yuzhih...@gmail.com], I found many such re-throw blocks in 
TestVisibilityLabelsWithDeletes.java. Should we resolve them all?


> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction<Void> scanAction = new 
> PrivilegedExceptionAction<Void>() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> // the AssertionError thrown on failure is caught below and
> // wrapped, so the enclosing test never sees the failed assertion:
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}
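
One conventional remedy for this pattern (a sketch only, not necessarily what 
the fix will look like; scanAndGetResults() is a hypothetical stand-in for the 
real scan-and-verify code) is to rethrow AssertionError before the broad catch 
can wrap it:

{code:java}
import static org.junit.Assert.assertEquals;

import java.io.IOException;

// Sketch only -- one way to avoid the AssertionFailureIgnored pattern:
// rethrow AssertionError before the broad catch wraps it, so JUnit
// still sees the failure.
public class RethrowSketch {
  public Void run() throws Exception {
    try {
      int[] next = scanAndGetResults();   // hypothetical stand-in
      assertEquals(1, next.length);
    } catch (AssertionError e) {
      throw e;                            // let the test framework report it
    } catch (Throwable t) {
      throw new IOException(t);           // wrap only genuine errors
    }
    return null;
  }

  private int[] scanAndGetResults() {     // hypothetical helper
    return new int[] { 1 };
  }
}
{code}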



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21174) [REST] Failed to parse empty qualifier in TableResource#getScanResource

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613355#comment-16613355
 ] 

Hudson commented on HBASE-21174:


Results for branch branch-1
[build #459 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> [REST] Failed to parse empty qualifier in TableResource#getScanResource
> ---
>
> Key: HBASE-21174
> URL: https://issues.apache.org/jira/browse/HBASE-21174
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21174.branch-1.001.patch, 
> HBASE-21174.master.001.patch, HBASE-21174.master.002.patch
>
>
> {code:xml}
> GET /t1/*?column=f:c1&column=f:
> {code}
> If I want to get the values of 'f:' (empty qualifier) for all rows in the 
> table via the REST server, I send the above request. However, this request 
> returns the values of all columns.
> {code:java|title=TableResource#getScanResource|borderStyle=solid}
>   for (String csplit : column) {
> String[] familysplit = csplit.trim().split(":");
> if (familysplit.length == 2) {
>   if (familysplit[1].length() > 0) {
> if (LOG.isTraceEnabled()) {
>   LOG.trace("Scan family and column : " + familysplit[0] + "  " + 
> familysplit[1]);
> }
> tableScan.addColumn(Bytes.toBytes(familysplit[0]), 
> Bytes.toBytes(familysplit[1]));
>   } else {
> tableScan.addFamily(Bytes.toBytes(familysplit[0]));
> if (LOG.isTraceEnabled()) {
>   LOG.trace("Scan family : " + familysplit[0] + " and empty 
> qualifier.");
> }
> tableScan.addColumn(Bytes.toBytes(familysplit[0]), null);
>   }
> } else if (StringUtils.isNotEmpty(familysplit[0])) {
>   if (LOG.isTraceEnabled()) {
> LOG.trace("Scan family : " + familysplit[0]);
>   }
>   tableScan.addFamily(Bytes.toBytes(familysplit[0]));
> }
>   }
> {code}
> With the above code, a column with an empty qualifier cannot be parsed 
> correctly. In other words, 'f:' (empty qualifier) and 'f' (column family 
> alone) are treated as having the same meaning, which is wrong.
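
The root cause is Java's String.split contract: with no limit argument, 
trailing empty strings are dropped, so 'f:' never reaches the two-element 
branch. A minimal, runnable illustration (the negative-limit variant shown at 
the end is just one possible remedy, not necessarily the patch):

{code:java}
// Java's String.split(regex) drops trailing empty strings, which is
// why an empty qualifier disappears during parsing:
public class SplitDemo {
  public static void main(String[] args) {
    System.out.println("f:c1".split(":").length);   // 2 -> ["f", "c1"]
    System.out.println("f:".split(":").length);     // 1 -> ["f"], qualifier lost!
    System.out.println("f".split(":").length);      // 1 -> ["f"]
    // Passing a negative limit preserves the trailing empty string:
    System.out.println("f:".split(":", -1).length); // 2 -> ["f", ""]
  }
}
{code}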



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21168) BloomFilterUtil uses hardcoded randomness

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613356#comment-16613356
 ] 

Hudson commented on HBASE-21168:


Results for branch branch-1
[build #459 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> BloomFilterUtil uses hardcoded randomness
> -
>
> Key: HBASE-21168
> URL: https://issues.apache.org/jira/browse/HBASE-21168
> Project: HBase
>  Issue Type: Task
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-21168.branch-1.002.patch, 
> HBASE-21168.master.001.patch, HBASE-21168.master.002.patch
>
>
> This was flagged by a Fortify scan, and while it doesn't appear to be a real 
> issue, it's pretty easy to take care of anyway.
> The hard-coded Random can be moved to the test class that actually needs it, 
> to make the static analysis happy.
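
One way to realize that (a sketch only; the class, field, and method names 
here are illustrative, not necessarily those of the committed patch) is to 
strip the fixed-seed Random from the utility and let the test inject its own:

{code:java}
import java.util.Random;

// Sketch only, with illustrative names. The idea: production code
// carries no fixed-seed Random; a test needing determinism installs
// its own instance through a package-private hook.
public class BloomNoiseSketch {
  private static Random randomGeneratorForTest;  // stays null in production

  static void setRandomGeneratorForTest(Random random) {
    randomGeneratorForTest = random;  // e.g. new Random(283742987L) in the test
  }

  static byte[] randomBytes(int n) {
    Random r = randomGeneratorForTest != null ? randomGeneratorForTest : new Random();
    byte[] b = new byte[n];
    r.nextBytes(b);
    return b;
  }
}
{code}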



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613359#comment-16613359
 ] 

Hudson commented on HBASE-21179:


Results for branch branch-1
[build #459 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Fix the number of actions in responseTooSlow log
> 
>
> Key: HBASE-21179
> URL: https://issues.apache.org/jira/browse/HBASE-21179
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21179.branch-1.001.patch, 
> HBASE-21179.master.001.patch, HBASE-21179.master.002.patch
>
>
> {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE}
> 2018-09-10 16:13:53,022 WARN  
> [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: 
> (responseTooSlow): 
> {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region=
>  
> tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5.,
>  {color:red}for 1 actions and 1st row{color} 
> key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"}
> {panel}
> The responseTooSlow log is printed when the processing time of a request 
> exceeds the specified threshold. The number of actions and the contents of 
> the first rowkey in the request are included in the log.
> However, the number of actions is inaccurate: it is actually the number 
> of regions that the request needs to visit.
> As in the log above, users may mistakenly conclude that 321262 ms was spent 
> processing a single action, which would be incredible, so we need to fix it.
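
Presumably the fix sums the actions across every RegionAction in the 
MultiRequest instead of reporting the region count; a sketch, assuming the 
standard protobuf-generated accessors for the repeated regionAction/action 
fields (not the committed patch):

{code:java}
import org.apache.hadoop.hbase.protobuf.generated.ClientProtos;

// Sketch (not the committed patch): count actions rather than regions.
// Assumes the protobuf-generated getters for the repeated fields.
public class ActionCount {
  static int numActions(ClientProtos.MultiRequest multi) {
    int actions = 0;
    for (ClientProtos.RegionAction regionAction : multi.getRegionActionList()) {
      actions += regionAction.getActionCount();  // actions, not regions
    }
    return actions;  // vs. multi.getRegionActionCount(), the region count
  }
}
{code}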



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21189) flaky job should gather machine stats

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613358#comment-16613358
 ] 

Hudson commented on HBASE-21189:


Results for branch branch-1
[build #459 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> flaky job should gather machine stats
> -
>
> Key: HBASE-21189
> URL: https://issues.apache.org/jira/browse/HBASE-21189
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21189.0.patch
>
>
> The flaky test job should gather all the same environment information as our 
> normal nightly tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613357#comment-16613357
 ] 

Hudson commented on HBASE-21190:


Results for branch branch-1
[build #459 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/459//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Log files and count of entries in each as we load from the MasterProcWAL store
> --
>
> Key: HBASE-21190
> URL: https://issues.apache.org/jira/browse/HBASE-21190
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21190.branch-2.1.001.patch
>
>
> Sometimes this can take a while, especially if there are loads of files. Emit 
> counts of entries so the operator gets a sense of the scale of the procedures 
> being processed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21189) flaky job should gather machine stats

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613346#comment-16613346
 ] 

Hudson commented on HBASE-21189:


Results for branch branch-2
[build #1242 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1242/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1242//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1242//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1242//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> flaky job should gather machine stats
> -
>
> Key: HBASE-21189
> URL: https://issues.apache.org/jira/browse/HBASE-21189
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21189.0.patch
>
>
> The flaky test job should gather all the same environment information as our 
> normal nightly tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21189) flaky job should gather machine stats

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613338#comment-16613338
 ] 

Hudson commented on HBASE-21189:


Results for branch branch-2.1
[build #318 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> flaky job should gather machine stats
> -
>
> Key: HBASE-21189
> URL: https://issues.apache.org/jira/browse/HBASE-21189
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21189.0.patch
>
>
> The flaky test job should gather all the same environment information as our 
> normal nightly tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20941) Create and implement HbckService in master

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613339#comment-16613339
 ] 

Hudson commented on HBASE-20941:


Results for branch branch-2.1
[build #318 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/318//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Create and implement HbckService in master
> --
>
> Key: HBASE-20941
> URL: https://issues.apache.org/jira/browse/HBASE-20941
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>
> Attachments: hbase-20941.master.001.patch, 
> hbase-20941.master.002.patch, hbase-20941.master.003.patch, 
> hbase-20941.master.004.patch, hbase-20941.master.004.patch, 
> hbase-20941.master.004.patch
>
>
> Create HbckService in master and implement the following methods:
>  # setTableState(): If table states are inconsistent with the actions/ 
> procedures working on them, manipulating their states in meta sometimes 
> fixes things.
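
As a rough usage sketch (hedged: these names follow the patch discussion and 
may differ in the committed version):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.ClusterConnection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Hbck;
import org.apache.hadoop.hbase.client.TableState;

// Sketch only -- names follow the patch discussion and may change once
// committed. Force a table's state in meta back to ENABLED when the
// recorded state and the running procedures disagree.
public class HbckSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (ClusterConnection conn =
             (ClusterConnection) ConnectionFactory.createConnection(conf);
         Hbck hbck = conn.getHbck()) {
      hbck.setTableStateInMeta(
          new TableState(TableName.valueOf("t1"), TableState.State.ENABLED));
    }
  }
}
{code}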



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21189) flaky job should gather machine stats

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613330#comment-16613330
 ] 

Hudson commented on HBASE-21189:


Results for branch branch-2.0
[build #808 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/808/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/808//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/808//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/808//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> flaky job should gather machine stats
> -
>
> Key: HBASE-21189
> URL: https://issues.apache.org/jira/browse/HBASE-21189
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21189.0.patch
>
>
> The flaky test job should gather all the same environment information as our 
> normal nightly tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-09-13 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613324#comment-16613324
 ] 

Wellington Chevreuil commented on HBASE-21185:
--

Thanks [~allan163]. Decided to go with *estimatedSizeOfCell*, as it already 
uses the cell's *heapSize* implementation internally.

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting wal issues, such as 
> faulty replication sinks. A useful piece of information one might want to 
> track is the size of a single WAL entry edit, as well as the size of each 
> edit cell. I am proposing a patch that adds calculations for these two, as 
> well as an option to seek straight to a given position in the WAL file being 
> analysed.
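
For context, a sketch of how the per-edit and per-cell sizes could be derived 
while walking a WAL entry, using the estimatedSizeOfCell helper discussed 
above (a sketch only, assuming PrivateCellUtil.estimatedSizeOfCell and 
WAL.Entry#getEdit from the 2.x code base; this is not the patch itself):

{code:java}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.PrivateCellUtil;
import org.apache.hadoop.hbase.wal.WAL;

// Sketch only (not the patch): report the size of every cell in a WAL
// entry's edit plus the total for the whole edit.
public class WalEntrySize {
  static long printSizes(WAL.Entry entry) {
    long editSize = 0;
    for (Cell cell : entry.getEdit().getCells()) {
      long cellSize = PrivateCellUtil.estimatedSizeOfCell(cell);
      System.out.println("cell size: " + cellSize);
      editSize += cellSize;
    }
    System.out.println("edit size: " + editSize);
    return editSize;
  }
}
{code}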



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-09-13 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21185:
-
Attachment: HBASE-21185.master.002.patch

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting wal issues, such as 
> faulty replication sinks. A useful piece of information one might want to 
> track is the size of a single WAL entry edit, as well as the size of each 
> edit cell. I am proposing a patch that adds calculations for these two, as 
> well as an option to seek straight to a given position in the WAL file being 
> analysed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613285#comment-16613285
 ] 

Hudson commented on HBASE-21179:


Results for branch branch-1.3
[build #466 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Fix the number of actions in responseTooSlow log
> 
>
> Key: HBASE-21179
> URL: https://issues.apache.org/jira/browse/HBASE-21179
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21179.branch-1.001.patch, 
> HBASE-21179.master.001.patch, HBASE-21179.master.002.patch
>
>
> {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE}
> 2018-09-10 16:13:53,022 WARN  
> [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: 
> (responseTooSlow): 
> {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region=
>  
> tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5.,
>  {color:red}for 1 actions and 1st row{color} 
> key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"}
> {panel}
> The responseTooSlow log is printed when the processing time of a request 
> exceeds the specified threshold. The number of actions and the contents of 
> the first rowkey in the request are included in the log.
> However, the number of actions is inaccurate: it is actually the number 
> of regions that the request needs to visit.
> As in the log above, users may mistakenly conclude that 321262 ms was spent 
> processing a single action, which would be incredible, so we need to fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613283#comment-16613283
 ] 

Hudson commented on HBASE-21190:


Results for branch branch-1.3
[build #466 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/466//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Log files and count of entries in each as we load from the MasterProcWAL store
> --
>
> Key: HBASE-21190
> URL: https://issues.apache.org/jira/browse/HBASE-21190
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21190.branch-2.1.001.patch
>
>
> Sometimes this can take a while, especially if there are loads of files. Emit 
> counts of entries so the operator gets a sense of the scale of the procedures 
> being processed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

