[jira] [Commented] (HBASE-20668) Avoid permission change if ExportSnapshot's copy fails

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498879#comment-16498879
 ] 

Hudson commented on HBASE-20668:


Results for branch branch-2
[build #812 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Avoid permission change if ExportSnapshot's copy fails
> --
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> An exception from a FileSystem operation in the finally block of 
> ExportSnapshot#doWork may hide the real exception from the FileUtil.copy call.
> I was debugging the following error [~romil.choksi] saw while testing 
> ExportSnapshot:
> {code:java}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR [main] util.AbstractHBaseTool: Error running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: Directory/File does not exist /apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:82)
> {code}
> Here is the corresponding code (with extra logging added):
> {code:java}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, initialOutputSnapshotDir, false,
>       false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>       snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot directory: from=" +
>       snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
>     LOG.warn((filesUser == null ? "" : "Change the owner of " + needSetOwnerDir + " to " + filesUser)
>         + (filesGroup == null ? "" : ", Change the group of " + needSetOwnerDir + " to " + filesGroup));
>     setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> }
> {code}
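The masking behavior in the snippet above can be reduced to a standalone Java sketch. This is not the actual HBase fix; the method names are hypothetical, and the suppressed-exception approach is only one way to keep the copy failure visible:

```java
import java.io.IOException;

// Standalone sketch of the problem: an exception thrown from a finally
// block replaces the exception already in flight from the try block,
// so the caller never sees the original copy failure.
public class FinallyMasking {

    // Models the structure above: the copy fails, then the permission
    // change in finally also fails and masks the copy failure.
    static void copyThenChown() throws IOException {
        try {
            throw new IOException("copy failed (the real error)");
        } finally {
            // Masks the IOException above: callers only see this one.
            throw new RuntimeException("setOwner failed");
        }
    }

    // One fix: remember the copy failure and attach the secondary
    // failure to it as a suppressed exception.
    static void copyThenChownSafely() throws IOException {
        IOException copyFailure = null;
        try {
            throw new IOException("copy failed (the real error)");
        } catch (IOException e) {
            copyFailure = e;
            throw e;
        } finally {
            try {
                throw new RuntimeException("setOwner failed");
            } catch (RuntimeException e) {
                if (copyFailure != null) {
                    copyFailure.addSuppressed(e); // keep the real cause primary
                } else {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            copyThenChown();
        } catch (Exception e) {
            System.out.println(e.getMessage()); // setOwner failed
        }
        try {
            copyThenChownSafely();
        } catch (Exception e) {
            System.out.println(e.getMessage()); // copy failed (the real error)
        }
    }
}
```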
> "return val = " was not seen in a rerun of the test.
> This is what the additional log revealed:
> {code:java}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.
> {code}

[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498880#comment-16498880
 ] 

Hudson commented on HBASE-20667:


Results for branch branch-2
[build #812 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1
>
> Attachments: HBASE-20667.patch
>
>
> If running replication unit tests, for example with {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18116) Replication source in-memory accounting should not include bulk transfer hfiles

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498878#comment-16498878
 ] 

Hudson commented on HBASE-18116:


Results for branch branch-2
[build #812 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/812//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Replication source in-memory accounting should not include bulk transfer 
> hfiles
> ---
>
> Key: HBASE-18116
> URL: https://issues.apache.org/jira/browse/HBASE-18116
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-18116.master.001.patch, 
> HBASE-18116.master.002.patch, HBASE-18116.master.003.patch
>
>
> In ReplicationSourceWALReaderThread we maintain a global quota on enqueued 
> replication work to prevent OOM from queuing too many edits on heap. When 
> calculating the size of a given replication queue entry, if it has associated 
> hfiles (is a bulk load to be replicated as a batch of hfiles), we get the 
> file sizes, include the sum, and apply that result to the quota. This isn't 
> quite right. Those hfiles will be pulled by the sink as a file copy, not 
> pushed by the source. The cells in those files are not queued in memory at 
> the source and therefore shouldn't be counted against the quota.
> Relatedly, the sum of the hfile sizes is also included when checking whether 
> queued work exceeds the configured replication queue capacity, which is 64 MB 
> by default. HFiles are commonly much larger than this. 
> So when we encounter a bulk load replication entry, typically both the quota 
> and capacity limits are exceeded; we break out of the loops and send right 
> away. What is transferred on the wire via HBase RPC, though, has only a 
> partial relationship to the calculation. 
> Depending on how you look at it, it makes sense to count hfile file sizes 
> against replication queue capacity limits. The sink will be occupied 
> transferring those files at the HDFS level. Anyway, this is how we have been 
> doing it and it is too late to change now. I do not, however, think it is 
> correct to apply hfile file sizes against a quota for in-memory state on the 
> source. The source doesn't queue or even transfer those bytes. 
> Something I noticed while working on HBASE-18027.
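The accounting split argued for above can be sketched in plain Java. The class and field names are hypothetical, not HBase's actual types; the point is only that bulk-load hfile bytes count toward the per-batch capacity check (the sink must copy them) but not toward the source's in-memory quota (those cells never sit on the source heap):

```java
// Hedged sketch of the proposed accounting for a replication queue entry.
public class ReplicationSizing {

    static final long BATCH_CAPACITY_BYTES = 64L * 1024 * 1024; // 64 MB default

    static class Entry {
        final long editHeapBytes; // WAL edit cells queued on the source heap
        final long hfileBytes;    // bulk-load hfiles, pulled by the sink as a copy

        Entry(long editHeapBytes, long hfileBytes) {
            this.editHeapBytes = editHeapBytes;
            this.hfileBytes = hfileBytes;
        }

        // Only heap-resident bytes should be charged to the OOM-protection quota.
        long memoryQuotaUsage() {
            return editHeapBytes;
        }

        // Batch capacity still counts hfiles: the sink is occupied copying them.
        long batchCapacityUsage() {
            return editHeapBytes + hfileBytes;
        }
    }

    public static void main(String[] args) {
        Entry bulkLoad = new Entry(2_000, 500L * 1024 * 1024); // 500 MB hfile
        // The hfile blows the batch capacity, so the entry is sent right away...
        System.out.println(bulkLoad.batchCapacityUsage() > BATCH_CAPACITY_BYTES);
        // ...but only the edit bytes are charged to the in-memory quota.
        System.out.println(bulkLoad.memoryQuotaUsage());
    }
}
```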





[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498861#comment-16498861
 ] 

stack commented on HBASE-20673:
---

Edits welcome!

> Reduce the number of Cell implementations; the profusion is distracting to 
> users and JIT
> 
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell implementations in hbase. Purge the bulk of 
> them. Make it so we do one type well. JIT gets confused if it has an abstract 
> or an interface and then the instantiated classes are myriad (megamorphic).
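The megamorphism point above can be illustrated with a generic Java sketch (not HBase code; the implementation names are only evocative of real Cell variants). Once a virtual call site observes more than two receiver classes, HotSpot typically abandons its inline cache for that site and falls back to full virtual dispatch, losing inlining and the optimizations that follow from it:

```java
// Illustration of a call site that becomes megamorphic when many
// implementations of one interface flow through it.
public class Megamorphic {

    interface Cell {
        int valueLength();
    }

    static class KeyValueCell implements Cell {
        public int valueLength() { return 1; }
    }

    static class ByteBufferCell implements Cell {
        public int valueLength() { return 2; }
    }

    static class IndividualBytesCell implements Cell {
        public int valueLength() { return 3; }
    }

    // With one implementation flowing through, this call site stays
    // monomorphic and c.valueLength() can be inlined. With three or more,
    // it is megamorphic and each call is a full virtual dispatch.
    static long sum(Cell[] cells) {
        long total = 0;
        for (Cell c : cells) {
            total += c.valueLength();
        }
        return total;
    }

    public static void main(String[] args) {
        Cell[] mixed = {
            new KeyValueCell(), new ByteBufferCell(), new IndividualBytesCell()
        };
        System.out.println(sum(mixed)); // 6
    }
}
```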





[jira] [Commented] (HBASE-20674) clean up short circuit read logic and docs

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498844#comment-16498844
 ] 

Hadoop QA commented on HBASE-20674:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
17s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 
38s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
43s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  2m 
25s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m 25s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
19s{color} | {color:green} root: The patch generated 0 new + 160 unchanged - 10 
fixed = 160 total (was 170) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 
53s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 
16s{color} | {color:red} patch has 18 errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  1m 
38s{color} | {color:red} The patch causes 18 errors with Hadoop v2.7.4. {color} 
|
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
29s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  2m 
30s{color} | {color:red} root generated 1 new + 27 unchanged - 0 fixed = 28 
total (was 27) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 13s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 

[jira] [Commented] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498839#comment-16498839
 ] 

Ted Yu commented on HBASE-20670:


+1

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
> Attachments: HBASE-20670-branch-1.patch, HBASE-20670.patch
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, an access to the master status page trips over 
> this problem. There might be other issues after we fix this, but an NPE is 
> always a bug, so let's address it. One option is to connect the ZK-based 
> components with ZK before attempting to bring up the filesystem. Let me try 
> that first. If that doesn't work, we could at least throw an IOE.
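The fallback option mentioned above (throw an IOE instead of an NPE) can be sketched as a null guard. The field and message below are illustrative, not the actual HBase patch:

```java
// Hedged sketch: fail with a descriptive IOException rather than an NPE
// when the status page is served before the ZK trackers exist.
public class MasterSketch {

    interface MaintenanceTracker {
        boolean isInMaintenanceMode();
    }

    // Set only after the filesystem comes up, so this is null while the
    // master is still waiting out HDFS safe mode.
    private MaintenanceTracker maintenanceModeTracker;

    public boolean isInMaintenanceMode() throws java.io.IOException {
        if (maintenanceModeTracker == null) {
            throw new java.io.IOException(
                "Master is not initialized yet; try again later");
        }
        return maintenanceModeTracker.isInMaintenanceMode();
    }

    public static void main(String[] args) {
        MasterSketch master = new MasterSketch();
        try {
            master.isInMaintenanceMode();
        } catch (java.io.IOException e) {
            // The status page can render this message instead of a 500.
            System.out.println(e.getMessage());
        }
    }
}
```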





[jira] [Commented] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2018-06-01 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498836#comment-16498836
 ] 

Sean Busbey commented on HBASE-20673:
-

Pardon the clarifications. I was getting hung up on Cell.Type vs. things that 
implement the Cell interface.

> Reduce the number of Cell implementations; the profusion is distracting to 
> users and JIT
> 
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell implementations in hbase. Purge the bulk of 
> them. Make it so we do one type well. JIT gets confused if it has an abstract 
> or an interface and then the instantiated classes are myriad (megamorphic).





[jira] [Updated] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2018-06-01 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20673:

Description: We have a wild blossom of Cell implementations in hbase. Purge 
the bulk of them. Make it so we do one type well. JIT gets confused if it has 
an abstract or an interface and then the instantiated classes are myriad 
(megamorphic).  (was: We have a wild blossom of Cell types in hbase. The Cell 
Interface has many implementors. Purge the bulk of them. Make it so we do one 
type well. JIT gets confused if it has an abstract or an interface and then the 
types are myriad (megamorphic).)

> Reduce the number of Cell implementations; the profusion is distracting to 
> users and JIT
> 
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell implementations in hbase. Purge the bulk of 
> them. Make it so we do one type well. JIT gets confused if it has an abstract 
> or an interface and then the instantiated classes are myriad (megamorphic).





[jira] [Updated] (HBASE-20673) Reduce the number of Cell implementations; the profusion is distracting to users and JIT

2018-06-01 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20673:

Summary: Reduce the number of Cell implementations; the profusion is 
distracting to users and JIT  (was: Purge Cell Types; the profusion is 
distracting to users and JIT)

> Reduce the number of Cell implementations; the profusion is distracting to 
> users and JIT
> 
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Commented] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498817#comment-16498817
 ] 

Hadoop QA commented on HBASE-20670:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 1 new + 223 
unchanged - 0 fixed = 224 total (was 223) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 
56s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20670 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926199/HBASE-20670.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 68dbabd7ad7f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 74ef118e9e |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13057/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13057/testReport/ |
| Max. process+thread 

[jira] [Updated] (HBASE-20674) clean up short circuit read logic and docs

2018-06-01 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20674:
--
Status: Patch Available  (was: Open)

Let's see what QA does with this patch.

> clean up short circuit read logic and docs
> --
>
> Key: HBASE-20674
> URL: https://issues.apache.org/jira/browse/HBASE-20674
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20674.patch
>
>
> Mailing list discussion at 
> https://lists.apache.org/thread.html/f6f73df0ceae29f762f9b9088e3ffd0bf8f109d3dd692df100bf4fd6@%3Cdev.hbase.apache.org%3E
> There are several inconsistencies between how our docs claim we do things and 
> how we actually do things.
> There are two docs sections that attempt to address how SCR should work.
> The docs advise setting dfs.client.read.shortcircuit.skip.checksum to true, 
> but our code ignores it in separate places and then later sets it to true 
> anyway.
> CommonFSUtils and FSUtils duplicate code related to SCR setup.
> There is a workaround in HFileSystem for a bug that has been fixed in all 
> versions of Hadoop that we support (HADOOP-9307).
> We suggest setting dfs.client.read.shortcircuit.buffer.size to a value that 
> is very close to what we'd set it to anyway, without clearly explaining why 
> this is important.
> There are other properties that we claim are important, but we don't offer 
> any suggestions or explanations.





[jira] [Updated] (HBASE-20674) clean up short circuit read logic and docs

2018-06-01 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20674:
--
Attachment: HBASE-20674.patch

> clean up short circuit read logic and docs
> --
>
> Key: HBASE-20674
> URL: https://issues.apache.org/jira/browse/HBASE-20674
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20674.patch
>
>
> Mailing list discussion at 
> https://lists.apache.org/thread.html/f6f73df0ceae29f762f9b9088e3ffd0bf8f109d3dd692df100bf4fd6@%3Cdev.hbase.apache.org%3E
> There are several inconsistencies between how our docs claim we do things and 
> how we actually do things.
> There are two docs sections that attempt to address how SCR should work.
> The docs advise setting dfs.client.read.shortcircuit.skip.checksum to true, 
> but our code ignores it in separate places and then later sets it to true 
> anyway.
> CommonFSUtils and FSUtils duplicate code related to SCR setup.
> There is a workaround in HFileSystem for a bug that has been fixed in all 
> versions of Hadoop that we support (HADOOP-9307).
> We suggest setting dfs.client.read.shortcircuit.buffer.size to a value that 
> is very close to what we'd set it to anyway, without clearly explaining why 
> this is important.
> There are other properties that we claim are important, but we don't offer 
> any suggestions or explanations.





[jira] [Created] (HBASE-20674) clean up short circuit read logic and docs

2018-06-01 Thread Mike Drob (JIRA)
Mike Drob created HBASE-20674:
-

 Summary: clean up short circuit read logic and docs
 Key: HBASE-20674
 URL: https://issues.apache.org/jira/browse/HBASE-20674
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.0.0
Reporter: Mike Drob
Assignee: Mike Drob


Mailing list discussion at 
https://lists.apache.org/thread.html/f6f73df0ceae29f762f9b9088e3ffd0bf8f109d3dd692df100bf4fd6@%3Cdev.hbase.apache.org%3E

There are several inconsistencies between how our docs claim we do things and 
how we actually do things.

There are two docs sections that attempt to address how SCR should work.

The docs advise setting dfs.client.read.shortcircuit.skip.checksum to true, 
but our code ignores it in separate places and then later sets it to true 
anyway.

CommonFSUtils and FSUtils duplicate code related to SCR setup.

There is a workaround in HFileSystem for a bug that has been fixed in all 
versions of Hadoop that we support (HADOOP-9307).

We suggest setting dfs.client.read.shortcircuit.buffer.size to a value that is 
very close to what we'd set it to anyway, without clearly explaining why this 
is important.

There are other properties that we claim are important, but we don't offer any 
suggestions or explanations.
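For reference, a short-circuit read setup touching the properties discussed above might look like the hbase-site.xml fragment below. The property names are the standard HDFS/HBase client settings; the socket path and buffer size values are examples only, not recommendations from this issue:

```xml
<!-- Illustrative hbase-site.xml fragment for short-circuit reads. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- Must match a domain socket path the DataNode is configured to use. -->
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
<property>
  <!-- The setting this issue says is ignored and then forced to true anyway. -->
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>true</value>
</property>
<property>
  <!-- The per-stream buffer this issue says we advise without explanation;
       HBase keeps many streams open, so a smaller buffer bounds client memory. -->
  <name>dfs.client.read.shortcircuit.buffer.size</name>
  <value>131072</value>
</property>
```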





[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498804#comment-16498804
 ] 

Hudson commented on HBASE-20667:


Results for branch branch-2.0
[build #377 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/377/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/377//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/377//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/377//JDK8_Nightly_Build_Report_(Hadoop3)/]


(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498782#comment-16498782
 ] 

Hadoop QA commented on HBASE-20634:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 3s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
10s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
49s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 14m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
9s{color} | {color:red} hbase-server: The patch generated 6 new + 108 unchanged 
- 2 fixed = 114 total (was 110) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
30s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
43s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m  0s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestClientClusterMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:369877d |
| JIRA Issue | HBASE-20634 |
| JIRA Patch URL | 

[jira] [Updated] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20667:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Updated] (HBASE-20657) Retrying RPC call for ModifyTableProcedure may get stuck

2018-06-01 Thread Sergey Soldatov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated HBASE-20657:

Attachment: HBASE-20657-testcase-branch2.patch

> Retrying RPC call for ModifyTableProcedure may get stuck
> 
>
> Key: HBASE-20657
> URL: https://issues.apache.org/jira/browse/HBASE-20657
> Project: HBase
>  Issue Type: Bug
>  Components: Client, proc-v2
>Affects Versions: 2.0.0
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-20657-testcase-branch2.patch
>
>
> Env: 2 masters, 1 RS. 
> Steps to reproduce: the active master is killed while a ModifyTableProcedure 
> is executing. 
> If the table has enough regions, it may happen that by the time the standby 
> master becomes active some of the regions are still closed, so once the 
> client retries the call against the new active master, a new 
> ModifyTableProcedure is created and gets stuck while handling the 
> MODIFY_TABLE_REOPEN_ALL_REGIONS state. That happens because:
> 1. When we are retrying from client side, we call modifyTableAsync which 
> create a procedure with a new nonce key:
> {noformat}
>  ModifyTableRequest request = 
> RequestConverter.buildModifyTableRequest(
> td.getTableName(), td, ng.getNonceGroup(), ng.newNonce());
> {noformat}
>  So on the server side it is considered a new procedure and starts 
> executing immediately.
> 2. When we are processing MODIFY_TABLE_REOPEN_ALL_REGIONS we create a 
> MoveRegionProcedure for each region, but it checks whether the region is 
> online (and it is not), so it fails immediately, forcing the procedure to 
> restart.
> [~an...@apache.org] saw a similar case where two concurrent ModifyTable 
> procedures were running and got stuck in a similar way. 





[jira] [Commented] (HBASE-20657) Retrying RPC call for ModifyTableProcedure may get stuck

2018-06-01 Thread Sergey Soldatov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498727#comment-16498727
 ] 

Sergey Soldatov commented on HBASE-20657:
-

Let's consider the case that [~an...@apache.org] found. I'm attaching a test 
case that reproduces the problem.
What happens:
1. The first ModifyTableProcedure (MTP) takes the exclusive lock, performs all 
the meta modifications, and creates subprocedures to reopen all regions. All 
those subprocedures go into the TableQueue. After that, it releases the lock. 
2. The second MTP takes the exclusive lock, performs all the meta 
modifications, and tries to create its subprocedures. Because the reopen 
creates a MoveRegionProcedure per region, we fail immediately in the 
MoveRegionProcedure constructor, since some regions may be closed at that 
moment by the first MTP. 
3. Since the lock was not released in (2), the scheduler will continuously try 
to execute the second MTP, failing over and over again. 
[~stack]  that's the problem I tried to describe in the discussion on 
HBASE-20202. There are several directions (not ways, because none of them 
actually works) for solving it:
1. Remove the call to openRegion in the MoveRegionProcedure constructor. We 
would still have the problem that, while the subprocedures of the first MTP 
are executing, the metadata is updated with the new values from the second 
MTP, and region opens would start failing (as in the example, where the report 
is that the compression setting is incorrect).
2. Make the MTP hold the lock. That doesn't work either. Even a single MTP 
call gets stuck if there is more than one proc-v2 worker. The reason is 
MasterProcedureScheduler#doPoll:
https://github.com/apache/hbase/blob/74ef118e9e2246c09280ebb7eb6552ef91bdd094/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L200
Since an MTP creates three levels of subprocedures, under concurrent execution 
the queue may hold a mix of subprocedures with different parents, and all 
workers may remove that table queue from the fair queue at the same time, so 
there will be nothing for them to execute and some regions would get stuck in 
RIT state. Not sure, but there is a chance that this is a side effect of 
HBASE-2, so FYI [~Apache9] 

Another question relates to (1) from the description. Is it expected that 
during a retry from the client we generate a new nonce key for the same 
procedure? 

Any thoughts, suggestions or directions are much appreciated. 



Also FYI [~an...@apache.org], [~elserj]
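The nonce question can be illustrated with a toy model of server-side deduplication — a hypothetical sketch where the map and the submit method are invented, not the real ProcedureExecutor API:

```java
import java.util.HashMap;
import java.util.Map;

public class NonceDemo {
    // Maps a client-supplied nonce to the procedure it created.
    static final Map<Long, Long> nonceToProcId = new HashMap<>();
    static long nextProcId = 1;

    // A resubmission with the same nonce returns the existing procedure id;
    // a fresh nonce (what the retry gets via ng.newNonce()) starts a brand
    // new procedure, which is the behavior questioned above.
    static long submit(long nonce) {
        return nonceToProcId.computeIfAbsent(nonce, n -> nextProcId++);
    }

    public static void main(String[] args) {
        long first = submit(42L);       // original request
        long retrySame = submit(42L);   // retry reusing the nonce: deduplicated
        long retryNew = submit(99L);    // retry with a new nonce: duplicate procedure
        System.out.println(first == retrySame); // true
        System.out.println(first == retryNew);  // false
    }
}
```

If the client retry reused the original nonce, the server would recognize the call as the same procedure instead of starting a second MTP.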

> Retrying RPC call for ModifyTableProcedure may get stuck
> 
>
> Key: HBASE-20657
> URL: https://issues.apache.org/jira/browse/HBASE-20657
> Project: HBase
>  Issue Type: Bug
>  Components: Client, proc-v2
>Affects Versions: 2.0.0
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-20657-testcase-branch2.patch
>
>
> Env: 2 masters, 1 RS. 
> Steps to reproduce: Active master is killed while ModifyTableProcedure is 
> executed. 
> If the table has enough regions it may come that when the secondary master 
> get active some of the regions may be closed, so once client retries the call 
> to the new active master, a new ModifyTableProcedure is created and get stuck 
> during MODIFY_TABLE_REOPEN_ALL_REGIONS state handling. That happens because:
> 1. When we are retrying from client side, we call modifyTableAsync which 
> create a procedure with a new nonce key:
> {noformat}
>  ModifyTableRequest request = 
> RequestConverter.buildModifyTableRequest(
> td.getTableName(), td, ng.getNonceGroup(), ng.newNonce());
> {noformat}
>  So on the server side, it's considered as a new procedure and starts 
> executing immediately.
> 2. When we are processing  MODIFY_TABLE_REOPEN_ALL_REGIONS we create 
> MoveRegionProcedure for each region, but it checks whether the region is 
> online (and it's not), so it fails immediately, forcing the procedure to 
> restart.
> [~an...@apache.org] saw a similar case when two concurrent ModifyTable 
> procedures were running and got stuck in the similar way. 





[jira] [Updated] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20667:
---
Fix Version/s: 2.0.1

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Commented] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498717#comment-16498717
 ] 

Hadoop QA commented on HBASE-20662:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
 3s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} hbase-client: The patch generated 0 new + 5 
unchanged - 1 fixed = 5 total (was 6) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 1 new + 170 
unchanged - 4 fixed = 171 total (was 174) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
44s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m 43s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
21s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}169m 
25s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
49s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}234m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20662 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926149/HBASE-20662.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 66b8dacd7a70 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Comment Edited] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498682#comment-16498682
 ] 

Andrew Purtell edited comment on HBASE-20667 at 6/1/18 11:40 PM:
-

I suppose so but some of us are still in the stone age with branch-1 (barely) :)
Edit: The category of this test is already correct


was (Author: apurtell):
I suppose so but some of us are still in the stone age with branch-1 (barely) :)

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Commented] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498712#comment-16498712
 ] 

Andrew Purtell commented on HBASE-20670:


Throw a PleaseHoldException from HMaster.isInMaintenanceMode. Just fixing the 
NPE. 
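A minimal sketch of that guard — hypothetical code: a stand-in tracker class and a plain IllegalStateException are used here instead of the real HBase types:

```java
public class MaintenanceModeGuardDemo {
    static class MaintenanceTracker { boolean active() { return false; } }

    // The tracker is initialized only after the filesystem comes up, so it
    // can still be null while HDFS is in safe mode.
    static MaintenanceTracker maintenanceModeTracker = null;

    // Instead of dereferencing the possibly-null tracker (the NPE), signal
    // the caller to retry later, in the spirit of PleaseHoldException.
    static boolean isInMaintenanceMode() {
        if (maintenanceModeTracker == null) {
            throw new IllegalStateException("Master is initializing, please hold");
        }
        return maintenanceModeTracker.active();
    }

    public static void main(String[] args) {
        try {
            isInMaintenanceMode();
            System.out.println("ready");
        } catch (IllegalStateException e) {
            System.out.println("hold");
        }
    }
}
```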

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
> Attachments: HBASE-20670-branch-1.patch, HBASE-20670.patch
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.





[jira] [Updated] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20670:
---
Status: Patch Available  (was: Open)

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 1.5.0, 1.3.3, 1.4.5
>
> Attachments: HBASE-20670-branch-1.patch, HBASE-20670.patch
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.





[jira] [Updated] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20670:
---
Attachment: HBASE-20670.patch
HBASE-20670-branch-1.patch

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
> Attachments: HBASE-20670-branch-1.patch, HBASE-20670.patch
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.





[jira] [Updated] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20670:
---
Fix Version/s: 2.0.1
   2.1.0
   3.0.0

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
> Attachments: HBASE-20670-branch-1.patch, HBASE-20670.patch
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.





[jira] [Commented] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498708#comment-16498708
 ] 

Hadoop QA commented on HBASE-20673:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-20673 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-20673 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926197/0001-current.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13056/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).
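The megamorphism point can be sketched in miniature — toy code, not HBase's actual Cell interface: a virtual call site that observes many receiver types defeats JIT inlining, while a single concrete type keeps the site monomorphic.

```java
public class MorphismDemo {
    interface Cell { int size(); }

    static final class A implements Cell { public int size() { return 1; } }
    static final class B implements Cell { public int size() { return 2; } }
    static final class C implements Cell { public int size() { return 3; } }

    // This call site sees three receiver types, so the JIT treats it as
    // megamorphic and cannot devirtualize/inline a single implementation.
    static int total(Cell[] cells) {
        int sum = 0;
        for (Cell c : cells) {
            sum += c.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(total(new Cell[] { new A(), new B(), new C() }));
    }
}
```

Collapsing the implementors down to one well-optimized type keeps such hot call sites monomorphic.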





[jira] [Commented] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498702#comment-16498702
 ] 

stack commented on HBASE-20673:
---

The patch tries to make BBKV our type in hbase2 (it makes anonymous classes 
where we can exploit an advantage when we know there are no tags). It removes 
types whose only reason to exist is that they parse the chunkid. The thinking 
was that it's rare we do the deserialization. It turns out this needs more 
study, as we seem to go slower with this patch in place, not faster.

 !hits.20673.png! 

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Commented] (HBASE-20628) SegmentScanner does over-comparing when one flushing

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498701#comment-16498701
 ] 

Hadoop QA commented on HBASE-20628:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
55s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} hbase-server: The patch generated 6 new + 5 unchanged 
- 0 fixed = 11 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 36s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}109m  
8s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:369877d |
| JIRA Issue | HBASE-20628 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926158/HBASE-20628.branch-2.0.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d09831391262 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 512397220a |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13052/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 

[jira] [Updated] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20673:
--
Attachment: 0001-current.patch

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20673:
--
Attachment: hits.20673.png

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch, 0001-current.patch, hits.20673.png
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498700#comment-16498700
 ] 

Hudson commented on HBASE-20669:


Results for branch branch-1.3
[build #348 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}
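The pattern findbugs flags here is calling {{Integer.valueOf}}, which allocates a boxed {{Integer}} that is then immediately auto-unboxed, where {{Integer.parseInt}} would return the primitive directly. A minimal self-contained illustration of the anti-pattern and its fix (not the actual HColumnDescriptor code):

```java
public class ParsePrimitive {
    public static void main(String[] args) {
        String ttl = "86400";
        // Flagged pattern: valueOf returns a boxed Integer that is
        // immediately auto-unboxed back into an int.
        int boxed = Integer.valueOf(ttl);
        // Preferred: parseInt parses straight to the primitive,
        // with no intermediate allocation.
        int direct = Integer.parseInt(ttl);
        System.out.println(boxed == direct); // true
    }
}
```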





[jira] [Commented] (HBASE-20664) Variable shared across multiple threads

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498699#comment-16498699
 ] 

Hudson commented on HBASE-20664:


Results for branch branch-1.3
[build #348 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/348//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Variable shared across multiple threads
> ---
>
> Key: HBASE-20664
> URL: https://issues.apache.org/jira/browse/HBASE-20664
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.5, 1.2.6.1, 1.3.2.1, 2.0.0.1
>
>
> Some static analysis found a variable that was used across multiple threads 
> without any synchronization, allowing race conditions.
> The variable does not need to be a member of the class; it can instead be 
> made a local variable.
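The fix pattern can be sketched as follows. This is a hypothetical illustration, not the HBase code: scratch state that each task needs only during its own run is declared as a local variable, which is confined to one thread, instead of an unsynchronized field shared by every worker.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LocalNotShared {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger matches = new AtomicInteger();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            pool.submit(() -> {
                // Local variable: no other thread can observe or
                // clobber it, so no synchronization is needed.
                int scratch = id * 10;
                if (scratch == id * 10) {
                    matches.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(matches.get()); // 4
    }
}
```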





[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498698#comment-16498698
 ] 

Umesh Agashe commented on HBASE-20634:
--

+1 lgtm

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch, 
> HBASE-20634.branch-2.0.007.patch
>
>
> Found this when implementing HBASE-20424, where we transit the peer sync 
> replication state while there is a server crash.
> The problem is that in ServerCrashAssign we do not hold the region lock, so 
> after we call handleRIT to clear the existing assign/unassign procedures 
> related to this rs, and before we schedule the assign procedures, we may 
> schedule an unassign procedure for a region on the crashed rs. This 
> procedure will not receive the ServerCrashException; instead, in 
> addToRemoteDispatcher, it will find that it cannot dispatch the remote call, 
> and a FailedRemoteDispatchException will be raised. We do not treat this 
> exception the same as ServerCrashException; instead, we try to expire the 
> rs. Since the rs has already been marked as expired, this is almost a 
> no-op, and the procedure is stuck forever.
> A possible fix is to treat FailedRemoteDispatchException the same as 
> ServerCrashException, since it is created only in addToRemoteDispatcher, 
> and the only reason we cannot dispatch a remote call is that the rs is 
> already dead. The nodeMap is a ConcurrentMap, so I think we could use it as 
> a guard.
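The proposed fix amounts to routing both exception types through the same recovery path. The class and method names below are illustrative stand-ins, not the actual HBase procedure-v2 API:

```java
// Illustrative stubs standing in for the real HBase exception types.
class ServerCrashException extends Exception {}
class FailedRemoteDispatchException extends Exception {}

public class DispatchFailureSketch {
    // Both exceptions mean the same thing: the target regionserver is
    // dead. So the procedure should take the server-crash recovery path
    // rather than try to expire an already-expired server and hang.
    static String handle(Exception e) {
        if (e instanceof ServerCrashException
                || e instanceof FailedRemoteDispatchException) {
            return "server-crash-recovery";
        }
        return "rethrow";
    }

    public static void main(String[] args) {
        System.out.println(handle(new FailedRemoteDispatchException()));
        System.out.println(handle(new IllegalStateException()));
    }
}
```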





[jira] [Updated] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20673:
--
Assignee: stack
  Status: Patch Available  (was: Open)

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 0001-current.patch
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Updated] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20673:
--
Attachment: 0001-current.patch

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Major
> Attachments: 0001-current.patch
>
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Updated] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20673:
--
Issue Type: Sub-task  (was: Bug)
Parent: HBASE-20188

> Purge Cell Types; the profusion is distracting to users and JIT
> ---
>
> Key: HBASE-20673
> URL: https://issues.apache.org/jira/browse/HBASE-20673
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Major
>
> We have a wild blossom of Cell types in hbase. The Cell Interface has many 
> implementors. Purge the bulk of them. Make it so we do one type well. JIT 
> gets confused if it has an abstract or an interface and then the types are 
> myriad (megamorphic).





[jira] [Created] (HBASE-20673) Purge Cell Types; the profusion is distracting to users and JIT

2018-06-01 Thread stack (JIRA)
stack created HBASE-20673:
-

 Summary: Purge Cell Types; the profusion is distracting to users 
and JIT
 Key: HBASE-20673
 URL: https://issues.apache.org/jira/browse/HBASE-20673
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: stack


We have a wild blossom of Cell types in hbase. The Cell Interface has many 
implementors. Purge the bulk of them. Make it so we do one type well. JIT gets 
confused if it has an abstract or an interface and then the types are myriad 
(megamorphic).





[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498692#comment-16498692
 ] 

stack commented on HBASE-20634:
---

.007 addresses nice review by [~uagashe]

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch, 
> HBASE-20634.branch-2.0.007.patch
>
>
> Found this when implementing HBASE-20424, where we transit the peer sync 
> replication state while there is a server crash.
> The problem is that in ServerCrashAssign we do not hold the region lock, so 
> after we call handleRIT to clear the existing assign/unassign procedures 
> related to this rs, and before we schedule the assign procedures, we may 
> schedule an unassign procedure for a region on the crashed rs. This 
> procedure will not receive the ServerCrashException; instead, in 
> addToRemoteDispatcher, it will find that it cannot dispatch the remote call, 
> and a FailedRemoteDispatchException will be raised. We do not treat this 
> exception the same as ServerCrashException; instead, we try to expire the 
> rs. Since the rs has already been marked as expired, this is almost a 
> no-op, and the procedure is stuck forever.
> A possible fix is to treat FailedRemoteDispatchException the same as 
> ServerCrashException, since it is created only in addToRemoteDispatcher, 
> and the only reason we cannot dispatch a remote call is that the rs is 
> already dead. The nodeMap is a ConcurrentMap, so I think we could use it as 
> a guard.





[jira] [Updated] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20634:
--
Attachment: HBASE-20634.branch-2.0.007.patch

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch, 
> HBASE-20634.branch-2.0.007.patch
>
>
> Found this when implementing HBASE-20424, where we transit the peer sync 
> replication state while there is a server crash.
> The problem is that in ServerCrashAssign we do not hold the region lock, so 
> after we call handleRIT to clear the existing assign/unassign procedures 
> related to this rs, and before we schedule the assign procedures, we may 
> schedule an unassign procedure for a region on the crashed rs. This 
> procedure will not receive the ServerCrashException; instead, in 
> addToRemoteDispatcher, it will find that it cannot dispatch the remote call, 
> and a FailedRemoteDispatchException will be raised. We do not treat this 
> exception the same as ServerCrashException; instead, we try to expire the 
> rs. Since the rs has already been marked as expired, this is almost a 
> no-op, and the procedure is stuck forever.
> A possible fix is to treat FailedRemoteDispatchException the same as 
> ServerCrashException, since it is created only in addToRemoteDispatcher, 
> and the only reason we cannot dispatch a remote call is that the rs is 
> already dead. The nodeMap is a ConcurrentMap, so I think we could use it as 
> a guard.





[jira] [Commented] (HBASE-20592) Create a tool to verify tables do not have prefix tree encoding

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498683#comment-16498683
 ] 

Hudson commented on HBASE-20592:


Results for branch branch-2.0
[build #376 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/376/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/376//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/376//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/376//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Create a tool to verify tables do not have prefix tree encoding
> ---
>
> Key: HBASE-20592
> URL: https://issues.apache.org/jira/browse/HBASE-20592
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability, tooling
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20592.master.001.patch, 
> HBASE-20592.master.002.patch, HBASE-20592.master.003.patch, 
> HBASE-20592.master.004.patch, HBASE-20592.master.005.patch, 
> HBASE-20592.master.006.patch
>
>
> HBase 2.0.0 removed PREFIX_TREE encoding so users need to modify data block 
> encoding to something else before upgrading to HBase 2.0+. A tool would help 
> users to verify that there are no tables left with PREFIX_TREE encoding.
> The tool needs to check the following:
>  * There are no tables where DATA_BLOCK_ENCODING => 'PREFIX_TREE'
>  * -Check existing hfiles that none of them have PREFIX_TREE encoding (in 
> case table description is changed but hfiles were not rewritten)-





[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498682#comment-16498682
 ] 

Andrew Purtell commented on HBASE-20667:


I suppose so but some of us are still in the stone age with branch-1 (barely) :)

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498672#comment-16498672
 ] 

Mike Drob commented on HBASE-20667:
---

No concerns here, but wouldn't the more sustainable approach be to use JUnit categories?

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.





[jira] [Commented] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498661#comment-16498661
 ] 

stack commented on HBASE-20671:
---

Seems related to HBASE-19529. I can take it if you want [~elserj]


> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Attachments: 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the child regions back out to the cluster. There is a log 
> message indicating that RegionStates acknowledges it doesn't know what this 
> region is as it replays the pv2 WAL; however, it incorrectly assumes that 
> the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}





[jira] [Commented] (HBASE-20331) clean up shaded packaging for 2.1

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498648#comment-16498648
 ] 

Hudson commented on HBASE-20331:


Results for branch HBASE-20331
[build #27 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/27/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/27//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/27//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/27//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/27//artifacts/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> clean up shaded packaging for 2.1
> -
>
> Key: HBASE-20331
> URL: https://issues.apache.org/jira/browse/HBASE-20331
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
>
> polishing pass on shaded modules for 2.0 based on trying to use them in more 
> contexts.





[jira] [Commented] (HBASE-20592) Create a tool to verify tables do not have prefix tree encoding

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498644#comment-16498644
 ] 

Hudson commented on HBASE-20592:


Results for branch branch-2
[build #811 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/811/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/811//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/811//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/811//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Create a tool to verify tables do not have prefix tree encoding
> ---
>
> Key: HBASE-20592
> URL: https://issues.apache.org/jira/browse/HBASE-20592
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability, tooling
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20592.master.001.patch, 
> HBASE-20592.master.002.patch, HBASE-20592.master.003.patch, 
> HBASE-20592.master.004.patch, HBASE-20592.master.005.patch, 
> HBASE-20592.master.006.patch
>
>
> HBase 2.0.0 removed PREFIX_TREE encoding so users need to modify data block 
> encoding to something else before upgrading to HBase 2.0+. A tool would help 
> users to verify that there are no tables left with PREFIX_TREE encoding.
> The tool needs to check the following:
>  * There are no tables where DATA_BLOCK_ENCODING => 'PREFIX_TREE'
>  * -Check existing hfiles that none of them have PREFIX_TREE encoding (in 
> case table description is changed but hfiles were not rewritten)-





[jira] [Commented] (HBASE-20664) Variable shared across multiple threads

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498642#comment-16498642
 ] 

Hudson commented on HBASE-20664:


Results for branch branch-1.2
[build #352 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/352/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/352//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/352//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/352//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Variable shared across multiple threads
> ---
>
> Key: HBASE-20664
> URL: https://issues.apache.org/jira/browse/HBASE-20664
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.5, 1.2.6.1, 1.3.2.1, 2.0.0.1
>
>
> Some static analysis found a variable that was used across multiple threads 
> without any synchronization, which could allow race conditions.
> The variable does not need to be a member of the class; it can simply be made 
> a local variable.
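The fix described above can be illustrated with a minimal sketch (this is not the actual HBase code; the class and method names are invented for the example):

```java
// Hypothetical illustration of the bug class: a field written by multiple
// threads without synchronization can race; making it a local variable
// confines it to each thread's stack and removes the hazard entirely.
public class LocalVarFix {
    // Racy version: concurrent callers share this unsynchronized field.
    private static int shared;

    static int racySquare(int x) {
        shared = x;             // another thread may overwrite this...
        return shared * shared; // ...before we read it back
    }

    // Fixed version: each invocation gets its own local, so nothing is shared.
    static int safeSquare(int x) {
        int local = x;
        return local * local;
    }

    public static void main(String[] args) {
        System.out.println(safeSquare(7)); // prints 49
    }
}
```

Because the value never needs to outlive a single call, demoting it to a local is both the simplest and the cheapest fix: no locking is introduced.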



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498626#comment-16498626
 ] 

Andrew Purtell commented on HBASE-20670:


Moving initialization of ZK trackers up ahead of the filesystem opens a can of 
worms, no go there. Probably will just throw a PleaseHoldException from 
HMaster.isInMaintenanceMode. So, some minor adjustments to MasterServices will 
be needed where in a couple of places we will now have checked exceptions. It 
is a Private annotated interface so no problem. 

> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 1.5.0, 1.3.3, 1.4.5
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE Is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498618#comment-16498618
 ] 

Umesh Agashe commented on HBASE-20634:
--

[~stack], I have posted comments on RB for the latest patch. Thanks!

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch
>
>
> Found this when implementing HBASE-20424, where we will transit the peer sync 
> replication state while there is a server crash.
> The problem is that, in ServerCrashAssign, we do not have the region lock, so 
> it is possible that after we call handleRIT to clear the existing 
> assign/unassign procedures related to this rs, and before we schedule the 
> assign procedures, we schedule an unassign procedure 
> for a region on the crashed rs. This procedure will not receive the 
> ServerCrashException; instead, in addToRemoteDispatcher, it will find that it 
> cannot dispatch the remote call, and a FailedRemoteDispatchException 
> will be raised. But we do not treat this exception the same as 
> ServerCrashException; instead, we will try to expire the rs. Obviously the rs 
> has already been marked as expired, so this is almost a no-op. Then the 
> procedure will be stuck there forever.
> A possible way to fix it is to treat FailedRemoteDispatchException the same 
> as ServerCrashException, as it is created in addToRemoteDispatcher 
> only, and the only reason we cannot dispatch a remote call is that the rs 
> has already died. The nodeMap is a ConcurrentMap, so I think we could use 
> it as a guard.
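The proposed fix reduces to routing both exception types through the same recovery path. A standalone sketch of that idea (the exception classes and return values here are placeholders, not the HBase procedure framework):

```java
// Illustrative sketch of the proposed handling, not the actual HBase patch:
// both exception types mean "the target server is gone", so the procedure
// recovers the same way instead of uselessly re-expiring the regionserver.
public class DispatchHandling {
    static class ServerCrashException extends Exception {}
    static class FailedRemoteDispatchException extends Exception {}

    static String onRemoteFailure(Exception e) {
        // A dispatch can only fail because the server is already dead, so a
        // FailedRemoteDispatchException is handled like a crash.
        if (e instanceof ServerCrashException
            || e instanceof FailedRemoteDispatchException) {
            return "retry-on-new-server";
        }
        return "expire-server";
    }
}
```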



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20668) Avoid permission change if ExportSnapshot's copy fails

2018-06-01 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20668:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, Josh.

> Avoid permission change if ExportSnapshot's copy fails
> --
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> The exception from FileSystem operation in the finally block of 
> ExportSnapshot#doWork may hide the real exception from FileUtil.copy call.
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code:java}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist 
> /apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code:java}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
>  This is what the additional log revealed:
> {code:java}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.
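The failure mode described here is the classic "exception in finally masks the exception in try" bug. A minimal, self-contained demonstration (the helper methods stand in for FileUtil.copy and setOwner; they are not the HBase code):

```java
// Demonstrates how a throw inside a finally block silently replaces the
// exception from the try block, and one way to keep the primary failure
// visible by attaching the cleanup failure as a suppressed exception.
public class FinallyMasking {
    static RuntimeException copyFails() {
        return new RuntimeException("copy failed");     // stands in for FileUtil.copy
    }
    static RuntimeException chownFails() {
        return new RuntimeException("setOwner failed"); // stands in for setOwner
    }

    // Buggy shape: the caller only ever sees "setOwner failed".
    static String masked() {
        try {
            try {
                throw copyFails();
            } finally {
                throw chownFails(); // discards "copy failed"
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    // Fixed shape: the primary exception survives; the cleanup failure is
    // recorded on it via addSuppressed instead of replacing it.
    static String preserved() {
        RuntimeException primary = null;
        try {
            throw copyFails();
        } catch (RuntimeException e) {
            primary = e;
        } finally {
            try {
                throw chownFails();
            } catch (RuntimeException cleanup) {
                if (primary != null) {
                    primary.addSuppressed(cleanup);
                } else {
                    primary = cleanup;
                }
            }
        }
        return primary.getMessage();
    }
}
```

Skipping the permission change entirely when the copy fails (the approach in the patch title) avoids the problem at the source: the finally block then has nothing dangerous to do.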



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20672) Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at every monitoring interval

2018-06-01 Thread Ankit Jain (JIRA)
Ankit Jain created HBASE-20672:
--

 Summary: Create new HBase metrics ReadRequestRate and 
WriteRequestRate that reset at every monitoring interval
 Key: HBASE-20672
 URL: https://issues.apache.org/jira/browse/HBASE-20672
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Reporter: Ankit Jain
Assignee: Ankit Jain


HBase currently provides counters for read/write requests (ReadRequestCount, 
WriteRequestCount). However, counters that reset only after a restart of the 
service are not easy to use, so we would like to expose two new metrics in 
HBase, ReadRequestRate and WriteRequestRate, at the region server level.
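A rate metric of this kind is usually derived by differencing successive samples of the monotonically increasing counter. A hedged sketch of that computation (names are illustrative, not the HBase metrics API):

```java
// Derives a requests-per-second rate from a monotonically increasing counter
// by comparing each sample with the previous one, which is effectively a
// counter that "resets" at every monitoring interval.
public class RequestRate {
    private long lastCount;
    private long lastTimeMs;

    RequestRate(long initialCount, long nowMs) {
        this.lastCount = initialCount;
        this.lastTimeMs = nowMs;
    }

    /** Returns requests/second over the interval since the previous sample. */
    double sample(long currentCount, long nowMs) {
        long deltaCount = currentCount - lastCount;
        long deltaMs = nowMs - lastTimeMs;
        lastCount = currentCount;
        lastTimeMs = nowMs;
        // Guard against a zero or negative interval (e.g. clock skew).
        return deltaMs <= 0 ? 0.0 : (deltaCount * 1000.0) / deltaMs;
    }
}
```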



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498576#comment-16498576
 ] 

Sean Busbey commented on HBASE-20667:
-

+1

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.
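The mismatch is easy to verify by hand. Assuming surefire-style patterns where {{\*}} matches any run of characters, a small check shows the old name slips through both filters while the new name is caught:

```java
// Checks which test-class names a simple '*'-glob filter selects. The glob
// handling here is a simplification of Maven Surefire's -Dtest matching,
// sufficient to show why TestGlobalThrottler is skipped.
public class TestNameFilter {
    static boolean matches(String name, String pattern) {
        // Translate the '*' glob into a regex; String.matches anchors fully.
        return name.matches(pattern.replace("*", ".*"));
    }

    static boolean selected(String name) {
        return matches(name, "TestReplication*")
            || matches(name, "Test*Replication*");
    }
}
```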



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20667) Rename TestGlobalThrottler to TestReplicationGlobalThrottler

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498573#comment-16498573
 ] 

Andrew Purtell commented on HBASE-20667:


If there are no concerns about this I am just going to go ahead and do it. Will 
wait a few hours before proceeding.

> Rename TestGlobalThrottler to TestReplicationGlobalThrottler
> 
>
> Key: HBASE-20667
> URL: https://issues.apache.org/jira/browse/HBASE-20667
> Project: HBase
>  Issue Type: Test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-20667.patch
>
>
> If you run the replication unit tests with something like {{mvn test 
> -Dtest=TestReplication\*,Test\*Replication\*}}, you will miss 
> TestGlobalThrottler. It should be renamed to TestReplicationGlobalThrottler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-06-01 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498572#comment-16498572
 ] 

Josh Elser commented on HBASE-20671:


Broad FYI for folks I know looking at similar things already: [~stack], 
[~Apache9], [~yuzhih...@gmail.com]

It seems to me that, while replaying the pv2 WAL, we need to somehow determine 
when a region has been deleted. I'd guess that this means ripping through the 
entire log before taking action, but I'm not sure if that's practical... Maybe 
I need to try to write a test first?

> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Attachments: 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the child regions back out to the cluster. There is a 
> log message which appears to indicate that RegionStates acknowledges that it 
> doesn't know what this region is as it's replaying the pv2 WAL; however, it 
> incorrectly assumes that the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-06-01 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20671:
---
Attachment: 
hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip

hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip

> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Attachments: 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the child regions back out to the cluster. There is a 
> log message which appears to indicate that RegionStates acknowledges that it 
> doesn't know what this region is as it's replaying the pv2 WAL; however, it 
> incorrectly assumes that the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-06-01 Thread Josh Elser (JIRA)
Josh Elser created HBASE-20671:
--

 Summary: Merged region brought back to life causing RS to be 
killed by Master
 Key: HBASE-20671
 URL: https://issues.apache.org/jira/browse/HBASE-20671
 Project: HBase
  Issue Type: Bug
  Components: amv2
Affects Versions: 2.0.0
Reporter: Josh Elser
Assignee: Josh Elser


Another bug coming out of a master restart and replay of the pv2 logs.

The master merged two regions into one successfully, was restarted, but then 
ended up assigning the child regions back out to the cluster. There is a log 
message which appears to indicate that RegionStates acknowledges that it 
doesn't know what this region is as it's replaying the pv2 WAL; however, it 
incorrectly assumes that the region is just OFFLINE and needs to be assigned.
{noformat}
2018-05-30 04:26:00,055 INFO  
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c and 
4017a3c778551d4d258c785d455f9c0b
2018-05-30 04:28:27,525 DEBUG 
[master/ctr-e138-1518143905142-336066-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
MergeTableRegionsProcedure table=tabletwo_merge, 
regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
forcibly=false
{noformat}
{noformat}
2018-05-30 04:29:20,263 INFO  
[master/ctr-e138-1518143905142-336066-01-03:2] 
assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
regionState=null; presuming OFFLINE
2018-05-30 04:29:20,263 INFO  
[master/ctr-e138-1518143905142-336066-01-03:2] assignment.RegionStates: 
Added to offline, CURRENTLY NEVER CLEARED!!! rit=OFFLINE, location=null, 
table=tabletwo_merge, region=a7dd6606dcacc9daf085fc9fa2aecc0c
2018-05-30 04:29:20,266 INFO  
[master/ctr-e138-1518143905142-336066-01-03:2] 
assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
regionState=null; presuming OFFLINE
2018-05-30 04:29:20,266 INFO  
[master/ctr-e138-1518143905142-336066-01-03:2] assignment.RegionStates: 
Added to offline, CURRENTLY NEVER CLEARED!!! rit=OFFLINE, location=null, 
table=tabletwo_merge, region=4017a3c778551d4d258c785d455f9c0b
{noformat}
Eventually, the RS reports in its online regions, and the master tells it to 
kill itself:
{noformat}
2018-05-30 04:29:24,272 WARN  
[RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
assignment.AssignmentManager: Killing 
ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20628) SegmentScanner does over-comparing when one flushing

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498569#comment-16498569
 ] 

stack commented on HBASE-20628:
---

Here is a run w/o patch and then with the patch.

 !hits.003.png! 

> SegmentScanner does over-comparing when one flushing
> 
>
> Key: HBASE-20628
> URL: https://issues.apache.org/jira/browse/HBASE-20628
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.1
>
> Attachments: HBASE-20628.branch-2.0.001 (1).patch, 
> HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.001.patch, 
> HBASE-20628.branch-2.0.002.patch, HBASE-20628.branch-2.0.003.patch, Screen 
> Shot 2018-05-25 at 9.38.00 AM.png, hits-20628.png, hits.003.png
>
>
> Flushing memstore is taking too long. It looks like we are doing a bunch of 
> comparing out of a new facility in hbase2, the Segment scanner at flush time.
> Below is a patch from [~anoop.hbase]. I had a similar more hacky version. 
> Both undo the extra comparing we were seeing in perf tests.
> [~anastas] and [~eshcar]. Need your help please.
> As I read it, we are trying to flush the memstore snapshot (default, no IMC 
> case). There is only ever going to be one Segment involved (even if IMC is 
> enabled); the snapshot Segment. But the getScanners is returning a list (of 
> one)  Scanners and the scan is via the generic SegmentScanner which is all 
> about a bunch of stuff we don't need when doing a flush so it seems to do 
> more work than is necessary. It also supports scanning backwards which is not 
> needed when trying to flush memstore.
> Do you see a problem doing a version of Anoops patch (whether IMC or not)? It 
> makes a big difference in general throughput when the below patch is in 
> place. Thanks.
> {code}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> index cbd60e5da3..c3dd972254 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
>  this.cellsCount = snapshot.getCellsCount();
>  this.memStoreSize = snapshot.getMemStoreSize();
>  this.timeRangeTracker = snapshot.getTimeRangeTracker();
> -this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +//this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +this.scanners = snapshot.getScannersForSnapshot();
>  this.tagsPresent = snapshot.isTagsPresent();
>}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> index 70074bf3b4..279c4e50c8 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
>  import org.apache.hadoop.hbase.io.TimeRange;
>  import org.apache.hadoop.hbase.util.Bytes;
>  import org.apache.hadoop.hbase.util.ClassSize;
> +import org.apache.hadoop.hbase.util.CollectionBackedScanner;
>  import org.apache.yetus.audience.InterfaceAudience;
>  import org.slf4j.Logger;
>  import 
> org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
> @@ -130,6 +131,10 @@ public abstract class Segment {
>  return Collections.singletonList(new SegmentScanner(this, readPoint, 
> order));
>}
> +  public List<KeyValueScanner> getScannersForSnapshot() {
> +    return Collections.singletonList(new CollectionBackedScanner(this.cellSet.get(), comparator));
> +  }
> +
>/**
> * @return whether the segment has any cells
> */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20628) SegmentScanner does over-comparing when one flushing

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20628:
--
Attachment: hits.003.png

> SegmentScanner does over-comparing when one flushing
> 
>
> Key: HBASE-20628
> URL: https://issues.apache.org/jira/browse/HBASE-20628
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.1
>
> Attachments: HBASE-20628.branch-2.0.001 (1).patch, 
> HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.001.patch, 
> HBASE-20628.branch-2.0.002.patch, HBASE-20628.branch-2.0.003.patch, Screen 
> Shot 2018-05-25 at 9.38.00 AM.png, hits-20628.png, hits.003.png
>
>
> Flushing memstore is taking too long. It looks like we are doing a bunch of 
> comparing out of a new facility in hbase2, the Segment scanner at flush time.
> Below is a patch from [~anoop.hbase]. I had a similar more hacky version. 
> Both undo the extra comparing we were seeing in perf tests.
> [~anastas] and [~eshcar]. Need your help please.
> As I read it, we are trying to flush the memstore snapshot (default, no IMC 
> case). There is only ever going to be one Segment involved (even if IMC is 
> enabled): the snapshot Segment. But getScanners returns a list (of one) of 
> Scanners, and the scan goes via the generic SegmentScanner, which carries a 
> lot of machinery we don't need when doing a flush, so it does more work than 
> necessary. It also supports scanning backwards, which is not 
> needed when trying to flush the memstore.
> Do you see a problem doing a version of Anoop's patch (whether IMC or not)? It 
> makes a big difference in general throughput when the below patch is in 
> place. Thanks.
> {code}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> index cbd60e5da3..c3dd972254 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
>  this.cellsCount = snapshot.getCellsCount();
>  this.memStoreSize = snapshot.getMemStoreSize();
>  this.timeRangeTracker = snapshot.getTimeRangeTracker();
> -this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +//this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +this.scanners = snapshot.getScannersForSnapshot();
>  this.tagsPresent = snapshot.isTagsPresent();
>}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> index 70074bf3b4..279c4e50c8 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
>  import org.apache.hadoop.hbase.io.TimeRange;
>  import org.apache.hadoop.hbase.util.Bytes;
>  import org.apache.hadoop.hbase.util.ClassSize;
> +import org.apache.hadoop.hbase.util.CollectionBackedScanner;
>  import org.apache.yetus.audience.InterfaceAudience;
>  import org.slf4j.Logger;
>  import 
> org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
> @@ -130,6 +131,10 @@ public abstract class Segment {
>  return Collections.singletonList(new SegmentScanner(this, readPoint, 
> order));
>}
> +  public List<KeyValueScanner> getScannersForSnapshot() {
> +    return Collections.singletonList(new CollectionBackedScanner(this.cellSet.get(), comparator));
> +  }
> +
>/**
> * @return whether the segment has any cells
> */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498549#comment-16498549
 ] 

Hadoop QA commented on HBASE-20634:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
58s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 8s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
20s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 13m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} hbase-server: The patch generated 6 new + 108 
unchanged - 2 fixed = 114 total (was 110) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 5s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 11s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}110m 
53s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}186m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:369877d |
| JIRA Issue | HBASE-20634 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926125/HBASE-20634.branch-2.0.006.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  

[jira] [Commented] (HBASE-20628) SegmentScanner does over-comparing when one flushing

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498541#comment-16498541
 ] 

stack commented on HBASE-20628:
---

.003 is a revamp to align with what [~eshcar] suggests up on RB (and higher up 
here), adding a getSnapshotScanner to ImmutableSegment. This seems the safest 
way to proceed. I did everything you suggested, [~eshcar], except the 
close-flag checking. If it is important, I can pull it in in a follow-on.

> SegmentScanner does over-comparing when one flushing
> 
>
> Key: HBASE-20628
> URL: https://issues.apache.org/jira/browse/HBASE-20628
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.1
>
> Attachments: HBASE-20628.branch-2.0.001 (1).patch, 
> HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.001.patch, 
> HBASE-20628.branch-2.0.002.patch, HBASE-20628.branch-2.0.003.patch, Screen 
> Shot 2018-05-25 at 9.38.00 AM.png, hits-20628.png
>
>
> Flushing the memstore is taking too long. It looks like we are doing a bunch 
> of comparing in a new hbase2 facility, the Segment scanner, at flush time.
> Below is a patch from [~anoop.hbase]. I had a similar, more hacky version. 
> Both undo the extra comparing we were seeing in perf tests.
> [~anastas] and [~eshcar]: we need your help, please.
> As I read it, we are trying to flush the memstore snapshot (the default, 
> no-IMC case). There is only ever going to be one Segment involved (even if 
> IMC is enabled): the snapshot Segment. But getScanners returns a list (of 
> one) of Scanners, and the scan goes via the generic SegmentScanner, which 
> carries a bunch of machinery we don't need when doing a flush, so it does 
> more work than is necessary. It also supports scanning backwards, which is 
> not needed when flushing the memstore.
> Do you see a problem with doing a version of Anoop's patch (whether IMC or 
> not)? It makes a big difference in general throughput when the patch below 
> is in place. Thanks.
> {code}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> index cbd60e5da3..c3dd972254 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
>  this.cellsCount = snapshot.getCellsCount();
>  this.memStoreSize = snapshot.getMemStoreSize();
>  this.timeRangeTracker = snapshot.getTimeRangeTracker();
> -this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +//this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +this.scanners = snapshot.getScannersForSnapshot();
>  this.tagsPresent = snapshot.isTagsPresent();
>}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> index 70074bf3b4..279c4e50c8 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
>  import org.apache.hadoop.hbase.io.TimeRange;
>  import org.apache.hadoop.hbase.util.Bytes;
>  import org.apache.hadoop.hbase.util.ClassSize;
> +import org.apache.hadoop.hbase.util.CollectionBackedScanner;
>  import org.apache.yetus.audience.InterfaceAudience;
>  import org.slf4j.Logger;
>  import 
> org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
> @@ -130,6 +131,10 @@ public abstract class Segment {
>  return Collections.singletonList(new SegmentScanner(this, readPoint, 
> order));
>}
> +  public List<KeyValueScanner> getScannersForSnapshot() {
> +    return Collections.singletonList(
> +        new CollectionBackedScanner(this.cellSet.get(), comparator));
> +  }
> +
>/**
> * @return whether the segment has any cells
> */
> {code}
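To make the intent of the patch above concrete, here is a hedged, self-contained sketch. The types and names are invented for illustration (this is not the HBase API): a flush needs only one forward pass over the already-sorted snapshot cells, so an iterator over the backing sorted collection, which is the idea behind CollectionBackedScanner, skips the per-cell seek/read-point machinery a general-purpose scanner brings along.

```java
import java.util.Iterator;
import java.util.TreeSet;

// Simplified sketch, not the HBase API: a flush needs one forward pass over
// the already-sorted snapshot cells, so an iterator over the backing sorted
// collection (the idea behind CollectionBackedScanner) avoids the per-cell
// seek/read-point comparisons a general-purpose scanner performs.
public class SnapshotScanSketch {

  // Minimal forward-only "scanner": drains cells in comparator order.
  static int drain(TreeSet<String> cells) {
    int n = 0;
    Iterator<String> it = cells.iterator();
    while (it.hasNext()) {
      it.next(); // a real flush would write the cell to the new HFile here
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    TreeSet<String> cellSet = new TreeSet<>(); // stand-in for a snapshot's CellSet
    cellSet.add("row-b");
    cellSet.add("row-a");
    cellSet.add("row-c");
    System.out.println(drain(cellSet)); // prints 3; iteration visits row-a first
  }
}
```

Because the snapshot is immutable during flush, a plain iterator is safe; backwards scanning and read-point filtering simply never come up.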



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20628) SegmentScanner does over-comparing when one flushing

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20628:
--
Attachment: HBASE-20628.branch-2.0.003.patch

> SegmentScanner does over-comparing when one flushing
> 
>
> Key: HBASE-20628
> URL: https://issues.apache.org/jira/browse/HBASE-20628
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.1
>
> Attachments: HBASE-20628.branch-2.0.001 (1).patch, 
> HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.001.patch, 
> HBASE-20628.branch-2.0.002.patch, HBASE-20628.branch-2.0.003.patch, Screen 
> Shot 2018-05-25 at 9.38.00 AM.png, hits-20628.png
>
>





[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498537#comment-16498537
 ] 

Hadoop QA commented on HBASE-20634:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
31s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m  
4s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 0s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 14m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
9s{color} | {color:red} hbase-server: The patch generated 6 new + 108 unchanged 
- 2 fixed = 114 total (was 110) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 3s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
9s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 40s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
56s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}185m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.security.token.TestZKSecretWatcher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:369877d |
| JIRA Issue | HBASE-20634 |
| JIRA Patch URL | 

[jira] [Updated] (HBASE-18116) Replication source in-memory accounting should not include bulk transfer hfiles

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18116:
---
Attachment: (was: HBASE-18116-branch-1.patch)

> Replication source in-memory accounting should not include bulk transfer 
> hfiles
> ---
>
> Key: HBASE-18116
> URL: https://issues.apache.org/jira/browse/HBASE-18116
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-18116.master.001.patch, 
> HBASE-18116.master.002.patch, HBASE-18116.master.003.patch
>
>
> In ReplicationSourceWALReaderThread we maintain a global quota on enqueued 
> replication work for preventing OOM by queuing up too many edits into queues 
> on heap. When calculating the size of a given replication queue entry, if it 
> has associated hfiles (is a bulk load to be replicated as a batch of hfiles), 
> we get the file sizes and include the sum. We then apply that result to the 
> quota. This isn't quite right. Those hfiles will be pulled by the sink as a 
> file copy, not pushed by the source. The cells in those files are not queued 
> in memory at the source and therefore shouldn't be counted against the quota.
> Related, the sum of the hfile sizes is also included when checking if queued 
> work exceeds the configured replication queue capacity, which is by default 
> 64 MB. HFiles are commonly much larger than this. 
> So what happens is when we encounter a bulk load replication entry typically 
> both the quota and capacity limits are exceeded, we break out of loops, and 
> send right away. What is transferred on the wire via HBase RPC though has 
> only a partial relationship to the calculation. 
> Depending how you look at it, it makes sense to factor hfile file sizes 
> against replication queue capacity limits. The sink will be occupied 
> transferring those files at the HDFS level. Anyway, this is how we have been 
> doing it and it is too late to change now. I do not however think it is 
> correct to apply hfile file sizes against a quota for in memory state on the 
> source. The source doesn't queue or even transfer those bytes. 
> Something I noticed while working on HBASE-18027.
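As a hedged illustration of the accounting distinction described above (the names and structure are invented, not the actual ReplicationSourceWALReaderThread code): charge only bytes actually buffered on the source heap against the in-memory quota, while still counting bulk-load hfile sizes toward the per-batch capacity check, since the sink pulls those files itself as an HDFS copy.

```java
// Invented names, not the actual HBase code: the issue argues the source's
// in-memory quota should count only bytes actually buffered on the source
// heap, while bulk-load hfile sizes still count toward the per-batch
// capacity check, because the sink pulls those files itself.
public class ReplicationAccountingSketch {
  static final long BATCH_CAPACITY = 64L * 1024 * 1024; // default 64 MB

  // Bytes charged against the in-memory quota: hfiles never sit on the heap.
  static long quotaCharge(long editHeapSize, long hfileBytes) {
    return editHeapSize;
  }

  // Bytes compared against batch capacity: the sink must transfer both.
  static long capacityCharge(long editHeapSize, long hfileBytes) {
    return editHeapSize + hfileBytes;
  }

  public static void main(String[] args) {
    long edits = 1024L, hfiles = 128L * 1024 * 1024;
    System.out.println(quotaCharge(edits, hfiles)); // prints 1024
    System.out.println(capacityCharge(edits, hfiles) > BATCH_CAPACITY); // prints true
  }
}
```

Under this split, a large bulk load still triggers an immediate ship (capacity exceeded) without pinning quota that no heap memory actually backs.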





[jira] [Resolved] (HBASE-18116) Replication source in-memory accounting should not include bulk transfer hfiles

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-18116.

Resolution: Fixed

> Replication source in-memory accounting should not include bulk transfer 
> hfiles
> ---
>
> Key: HBASE-18116
> URL: https://issues.apache.org/jira/browse/HBASE-18116
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-18116.master.001.patch, 
> HBASE-18116.master.002.patch, HBASE-18116.master.003.patch
>
>





[jira] [Commented] (HBASE-20647) Backport HBASE-20616 "TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state" to branch-1

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498505#comment-16498505
 ] 

Hudson commented on HBASE-20647:


Results for branch branch-1.4
[build #341 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Backport HBASE-20616 "TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state" to branch-1
> -
>
> Key: HBASE-20647
> URL: https://issues.apache.org/jira/browse/HBASE-20647
> Project: HBase
>  Issue Type: Sub-task
>  Components: backport
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 1.5.0, 1.4.5
>
> Attachments: HBASE-20647.branch-1.001.patch
>
>
> Backport parent issue to branch-1.





[jira] [Commented] (HBASE-20664) Variable shared across multiple threads

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498507#comment-16498507
 ] 

Hudson commented on HBASE-20664:


Results for branch branch-1.4
[build #341 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Variable shared across multiple threads
> ---
>
> Key: HBASE-20664
> URL: https://issues.apache.org/jira/browse/HBASE-20664
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.5, 1.2.6.1, 1.3.2.1, 2.0.0.1
>
>
> Some static analysis found a variable that was used across multiple threads 
> without any synchronization, allowing race conditions.
> The variable does not need to be a member of the class; it was instead made a 
> local variable.
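A minimal sketch of the class of bug described, with invented names (the actual flagged class is not identified in this notification): a mutable field shared by calling threads races, while a local variable is confined to its own thread and needs no synchronization at all.

```java
// Invented example of the bug class described: a mutable field shared across
// threads races, while a local variable is confined to its calling thread.
public class SharedVsLocalSketch {

  private int sharedTmp; // racy: every calling thread mutates the same field

  int sumRacy(int[] xs) {
    sharedTmp = 0;
    for (int x : xs) {
      sharedTmp += x; // another thread can clobber this mid-loop
    }
    return sharedTmp;
  }

  // The fix from the issue: make the accumulator a local variable, so no
  // synchronization is needed at all.
  int sumSafe(int[] xs) {
    int tmp = 0;
    for (int x : xs) {
      tmp += x;
    }
    return tmp;
  }

  public static void main(String[] args) {
    SharedVsLocalSketch s = new SharedVsLocalSketch();
    System.out.println(s.sumSafe(new int[] {1, 2, 3})); // prints 6
  }
}
```

Demoting the field to a local is both the simplest and the cheapest fix: thread confinement removes the race without locks or volatile.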





[jira] [Commented] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498506#comment-16498506
 ] 

Hudson commented on HBASE-20616:


Results for branch branch-1.4
[build #341 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --
>
> Key: HBASE-20616
> URL: https://issues.apache.org/jira/browse/HBASE-20616
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
> Environment: HDP-2.5.3
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, 
> HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, 
> HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /apps/hbase/data/.tmp/data.regioninfo could 
> only be replicated to 0 nodes instead of minReplication (=1).  There are 
> <number of DNs> datanode(s) running and no node(s) are excluded in this 
> operation.
> ...
> {code}
> But at this point, it seemed that writing some of the files to HDFS had been 
> successful.
> And then, TruncateTableProcedure was stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log 
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> java.io.IOException: The specified region already exists on disk: 
> hdfs:///apps/hbase/data/.tmp/data///
> ...
> {code}
> It seems like this is because TruncateTableProcedure tried to write the files 
> that were written successfully in the first try.
> I think we need to delete all the files and directories that are written 
> successfully in the previous try before retrying the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> Actually, this issue was observed in HDP-2.5.3, but I think the upstream has 
> the same issue. Also, it looks to me that CreateTableProcedure has a similar 
> issue.
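A hedged sketch of the fix the description proposes, using plain java.nio.file in place of the HDFS FileSystem API and invented names: delete whatever partial layout the previous attempt left behind before retrying the create step, so "already exists" errors cannot wedge the retry loop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Invented names; plain java.nio.file stands in for the HDFS FileSystem API.
// The proposed fix: clean up the partial layout from the previous attempt
// before retrying, so "already exists" errors cannot wedge the retry loop.
public class RetryableLayoutSketch {

  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(dir)) {
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p); // children first, thanks to reverse ordering
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      });
    }
  }

  static void createLayout(Path tmpTableDir) throws IOException {
    deleteRecursively(tmpTableDir); // remove leftovers from a failed try
    Files.createDirectories(tmpTableDir.resolve("region1"));
    Files.createDirectories(tmpTableDir.resolve("region2"));
  }

  public static void main(String[] args) throws IOException {
    Path table = Files.createTempDirectory("layout-sketch").resolve("table");
    createLayout(table);
    createLayout(table); // the "retry" succeeds instead of failing
    System.out.println(Files.exists(table.resolve("region1"))); // prints true
  }
}
```

Making the step idempotent this way means a retry never depends on how far the previous attempt got before failing.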





[jira] [Commented] (HBASE-20608) Remove build option of error prone profile for branch-1 after HBASE-12350

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498504#comment-16498504
 ] 

Hudson commented on HBASE-20608:


Results for branch branch-1.4
[build #341 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/341//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Remove build option of error prone profile for branch-1 after HBASE-12350
> -
>
> Key: HBASE-20608
> URL: https://issues.apache.org/jira/browse/HBASE-20608
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.5
>
>
> After HBASE-12350, the error-prone profile was introduced/backported to 
> branch-1 and branch-2. However, branch-1 still builds with JDK 7, which is 
> incompatible with the error-prone profile, so `mvn test-compile` has failed 
> since then.
> Opening this issue to track the removal of `-PerrorProne` from the build 
> command (in Jenkins).





[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498490#comment-16498490
 ] 

Hudson commented on HBASE-20669:


SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1117 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1117/])
HBASE-20669 [findbugs] autoboxing to parse primitive (busbey: rev 
3b6839c5b02b669a8927664055c6461c7013978e)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java


> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}
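For reference, this is the pattern the FindBugs Bx warning ("Boxing/unboxing to parse a primitive") flags, shown in a minimal invented example; the actual flagged line in HColumnDescriptor is not reproduced here.

```java
// Minimal invented example of the FindBugs Bx pattern; the actual flagged
// line in HColumnDescriptor is not reproduced here.
public class ParsePrimitiveSketch {

  // Flagged style: Integer.valueOf boxes an Integer, which is then
  // auto-unboxed back to int just to obtain the parsed value.
  static int parseBoxed(String s) {
    return Integer.valueOf(s);
  }

  // Preferred style: parseInt returns the primitive directly, no wrapper.
  static int parseUnboxed(String s) {
    return Integer.parseInt(s);
  }

  public static void main(String[] args) {
    System.out.println(parseBoxed("579") == parseUnboxed("579")); // prints true
  }
}
```

The two are behaviorally equivalent for valid input; parseInt just skips the needless wrapper allocation, which is why the fix is a one-line substitution.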





[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498463#comment-16498463
 ] 

Hudson commented on HBASE-20669:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #416 (See 
[https://builds.apache.org/job/HBase-1.3-IT/416/])
HBASE-20669 [findbugs] autoboxing to parse primitive (busbey: rev 
0cb3d511e8c193f3a0de50762b83597e9f777a70)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java


> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>





[jira] [Comment Edited] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498459#comment-16498459
 ] 

Nihal Jain edited comment on HBASE-20662 at 6/1/18 7:41 PM:


{quote}Master performing enabling/disabling of the table seems to be a cleaner 
solution instead of all the RSes trying to do the same.
{quote}
Right, but I think we should create another JIRA to handle that; the root 
cause of this issue is not multiple RSes submitting the enable request 
concurrently, but the enable method not checking whether the quota is in 
violation before throwing the exception.
Though I can handle this change as part of this JIRA too, if required.


was (Author: nihaljain.cs):
{quote}Master performing enabling/disabling of the table seems to be a cleaner 
solution instead of all the RSes trying to do the same.
{quote}
Right, but I think we should create another JIRA to handle that; the root 
cause of this issue is not multiple RSes submitting the enable request 
concurrently, but the enable method not checking whether the quota is in 
violation before throwing the exception.

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> 

[jira] [Commented] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498459#comment-16498459
 ] 

Nihal Jain commented on HBASE-20662:


{quote}Master performing enabling/disabling of the table seems to be a cleaner 
solution instead of all the RSes trying to do the same.
{quote}
Right, but I think we should create another JIRA to handle that; the root 
cause of this issue is not multiple RSes submitting the enable request 
concurrently, but the enable method not checking whether the quota is in 
violation before throwing the exception.

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> 

[jira] [Commented] (HBASE-20666) Unsuccessful table creation leaves entry in rsgroup meta table

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498458#comment-16498458
 ] 

Nihal Jain commented on HBASE-20666:


Hi [~gsbiju], are you working on this? If not, I can have a look.

> Unsuccessful table creation leaves entry in rsgroup meta table
> --
>
> Key: HBASE-20666
> URL: https://issues.apache.org/jira/browse/HBASE-20666
> Project: HBase
>  Issue Type: Bug
>Reporter: Biju Nair
>Priority: Minor
>
> If table creation fails in a cluster with the {{rsgroup}} feature enabled, the 
> table is still listed as part of the {{default}} rsgroup.
> To recreate the scenario:
> - Create a namespace (NS) with a region-count limit
> - Create a table in the NS that satisfies the region limit by pre-splitting
> - Create another table in the NS, which will fail
> - {{list_rsgroup}} will show the table as part of the {{default}} rsgroup, and 
> its data can be found in the {{hbase:rsgroup}} table
> It would be good to revert the entry when table creation fails, or to provide 
> a script to clean up the metadata.





[jira] [Commented] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Biju Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498447#comment-16498447
 ] 

Biju Nair commented on HBASE-20662:
---

Master performing {{enabling/disabling}} of the table seems to be a cleaner 
solution instead of all the RSes trying to do the same. 

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:360)
> 

[jira] [Updated] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-20662:
---
Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:360)
>   at 
> 

[jira] [Comment Edited] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498425#comment-16498425
 ] 

Nihal Jain edited comment on HBASE-20662 at 6/1/18 7:20 PM:


Hi [~elserj], I have attached a patch which I prepared last night. This patch 
will read the current quota snapshot from the quota table (as the snapshot in 
the master's memory has not been updated yet and might still show the quota in 
violation) and then throw the exception only if the quota is in violation.

I have understood what you want to convey, but I suspect it would not fix this 
issue completely, as nowhere inside the enable table method do we check whether 
the quota is in violation, hence the failure to enable the table. What do you 
think: will moving the disable step to the master solve the problem?
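The proposed check can be sketched as below. {{Policy}}, {{quotaSnapshots}}, and {{enableTable}} are simplified stand-ins for the HBase types ({{SpaceViolationPolicy}}, the {{hbase:quota}} table, and the master's enable path), not the actual patch; the point is only that the enable is rejected based on a freshly read snapshot rather than the master's possibly stale in-memory copy:

```java
import java.util.HashMap;
import java.util.Map;

public class QuotaEnableSketch {
    enum Policy { NONE, DISABLE } // stand-in for SpaceViolationPolicy

    // Stand-in for the current snapshots read from the quota table
    // (rather than the master's possibly stale in-memory state).
    static final Map<String, Policy> quotaSnapshots = new HashMap<>();

    // Reject the enable only while the freshly read snapshot still
    // shows the table in violation of a DISABLE policy.
    static void enableTable(String table) {
        Policy p = quotaSnapshots.getOrDefault(table, Policy.NONE);
        if (p == Policy.DISABLE) {
            throw new IllegalStateException("Enabling the table '" + table
                + "' is disallowed due to a violated space quota.");
        }
        // ...continue with the normal enable procedure...
    }

    public static void main(String[] args) {
        quotaSnapshots.put("t1", Policy.DISABLE);
        try {
            enableTable("t1");
        } catch (IllegalStateException e) {
            System.out.println("rejected while in violation");
        }
        quotaSnapshots.put("t1", Policy.NONE); // quota raised, violation cleared
        enableTable("t1");
        System.out.println("enabled after quota increase");
    }
}
```

With this shape, the chore's enable request succeeds once the stored snapshot reflects the increased limit, while a user's attempt to enable a still-violating table is still refused.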


was (Author: nihaljain.cs):
Hi [~elserj], I have attached a patch which I prepared last night. This patch 
will read the current quota snapshot from the quota table (as the snapshot in 
the master's memory has not been updated yet) and then throw the exception only 
if the quota is in violation.

I have understood what you want to convey, but I suspect it would not fix this 
issue completely, as nowhere inside the enable table method do we check whether 
the quota is in violation, hence the failure to enable the table. What do you think?

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at 

[jira] [Updated] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20669:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

pushed to branches-1. thanks [~jojochuang]!

> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}





[jira] [Comment Edited] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498425#comment-16498425
 ] 

Nihal Jain edited comment on HBASE-20662 at 6/1/18 7:16 PM:


Hi [~elserj], I have attached a patch which I prepared last night. This patch 
will read the current quota snapshot from the quota table (as the snapshot in 
the master's memory has not been updated yet) and then throw the exception only 
if the quota is in violation.

I have understood what you want to convey, but I suspect it would not fix this 
issue completely, as nowhere inside the enable table method do we check whether 
the quota is in violation, hence the failure to enable the table. What do you think?


was (Author: nihaljain.cs):
Hi [~elserj], I have attached a patch which I prepared last night. This patch 
will read the current quota snapshot from the quota table and then throw the 
exception only if the quota is in violation. I have understood what you want to 
convey, but I suspect it would not fix this issue completely, as nowhere inside 
the enable table method do we check whether the quota is in violation, hence 
the failure to enable the table.

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> 

[jira] [Commented] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498425#comment-16498425
 ] 

Nihal Jain commented on HBASE-20662:


Hi [~elserj], I have attached a patch which I prepared yesterday night. The 
patch reads the current quota snapshot from the quota table and then throws an 
exception only if the quota is in violation. I understand what you want to 
convey, but I suspect it would not fix this issue completely: inside the 
enable-table method we nowhere check whether the quota is in violation or not, 
which is why enabling the table fails. 
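The guard described in the comment above can be sketched with hypothetical types (nothing here is the actual HBase quota code): consult the latest quota snapshot and reject the enable only while the table is still in violation.

```java
// Minimal sketch (all names are hypothetical, not the real HBase classes):
// read the latest quota snapshot and refuse to enable the table only when
// it is still in violation of its space quota.
public class EnableTableGuard {
    // Stand-in for the snapshot read back from the quota table.
    static final class QuotaSnapshot {
        final boolean inViolation;
        QuotaSnapshot(boolean inViolation) { this.inViolation = inViolation; }
    }

    static void checkEnableAllowed(QuotaSnapshot current) {
        if (current != null && current.inViolation) {
            throw new IllegalStateException(
                "Enabling the table is disallowed due to a violated space quota");
        }
        // No snapshot, or quota satisfied: the enable may proceed.
    }
}
```

With a check like this, an enable triggered by a quota increase (which removes the violation first) would pass, while a user-initiated enable of a still-violating table would still be rejected.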

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> 

[jira] [Commented] (HBASE-20635) Support to convert the shaded user permission proto to client user permission object

2018-06-01 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498423#comment-16498423
 ] 

Josh Elser commented on HBASE-20635:


[~rajeshbabu] do you just need to use 
{{org.apache.hadoop.hbase.security.access.ShadedAccessControlUtil}} instead? :)

I don't, however, see a conversion for
{noformat}
UserPermission 
toUserPermission(org.apache.hadoop.hbase.shaded.protobuf.generated.AccessControlProtos.UserPermission){noformat}
like you have in your patch. Should this be added to ShadedAccessControlUtil? 
(not sure what exactly you need for Phoenix)
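The conversion shape under discussion can be illustrated with stand-in types (none of these are the real HBase or protobuf classes; the field names are invented for the sketch):

```java
// Illustrative stand-in for the missing shaded-proto-to-client conversion:
// the same shape as AccessControlUtil.toUserPermission, but taking the
// "shaded" protobuf message. All types here are fakes for demonstration.
public class ShadedConversionSketch {
    // Pretend shaded protobuf message.
    static final class ShadedUserPermissionProto {
        final byte[] user;
        final String permission;
        ShadedUserPermissionProto(byte[] user, String permission) {
            this.user = user;
            this.permission = permission;
        }
    }

    // Pretend client-side permission object.
    static final class UserPermission {
        final byte[] user;
        final String permission;
        UserPermission(byte[] user, String permission) {
            this.user = user;
            this.permission = permission;
        }
    }

    // The conversion Josh describes as missing: shaded proto -> client object.
    static UserPermission toUserPermission(ShadedUserPermissionProto proto) {
        return new UserPermission(proto.user, proto.permission);
    }
}
```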

> Support to convert the shaded user permission proto to client user permission 
> object
> 
>
> Key: HBASE-20635
> URL: https://issues.apache.org/jira/browse/HBASE-20635
> Project: HBase
>  Issue Type: Bug
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Attachments: HBASE-20635.patch
>
>
> Currently we have API to build the protobuf UserPermission to client user 
> permission in AccessControlUtil but we cannot do the same when we use shaded 
> protobufs.
> {noformat}
>   /**
>* Converts a user permission proto to a client user permission object.
>*
>* @param proto the protobuf UserPermission
>* @return the converted UserPermission
>*/
>   public static UserPermission 
> toUserPermission(AccessControlProtos.UserPermission proto) {
> return new UserPermission(proto.getUser().toByteArray(),
> toTablePermission(proto.getPermission()));
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-01 Thread Nihal Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-20662:
---
Attachment: HBASE-20662.master.001.patch

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Attachments: HBASE-20662.master.001.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as newly set quota limit is not reached
> *Actual*
> We fail to put any new row even after increasing limit
> *Root cause*
> Increasing quota on a violated table triggers the table to be enabled, but 
> since the table is already in violation, the system does not allow it to be 
> enabled (may be thinking that a user is trying to enable it)
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:360)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:348)
>   at 
> 

[jira] [Updated] (HBASE-20668) Avoid permission change if ExportSnapshot's copy fails

2018-06-01 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20668:
---
Summary: Avoid permission change if ExportSnapshot's copy fails  (was: 
Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
may hide exception from FileUtil.copy call)

> Avoid permission change if ExportSnapshot's copy fails
> --
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.
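The masking behavior in this issue is generic Java try/finally semantics: an exception thrown while a finally block runs replaces the one already in flight. A minimal, self-contained sketch (hypothetical method and messages, not the ExportSnapshot code) of both the bug and a guard-flag fix:

```java
// Demonstrates exception masking in a finally block, and one fix: only run
// the cleanup-time operation (here standing in for setOwner) if the main
// work succeeded, so its failure cannot replace the original exception.
public class FinallyMasking {
    static void copyThenChown(boolean copyFails, boolean chownFails) throws Exception {
        boolean copySucceeded = false;
        try {
            if (copyFails) {
                throw new java.io.IOException("copy failed: permission denied");
            }
            copySucceeded = true;
        } finally {
            // Guard: without the copySucceeded check, a failure here would
            // silently discard the IOException thrown by the copy above.
            if (copySucceeded && chownFails) {
                throw new java.io.FileNotFoundException("setOwner failed");
            }
        }
    }

    public static void main(String[] args) {
        try {
            copyThenChown(true, true);
        } catch (Exception e) {
            // With the guard, the original copy failure surfaces.
            System.out.println(e.getMessage());  // copy failed: permission denied
        }
    }
}
```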





[jira] [Updated] (HBASE-20668) Avoid permission change if ExportSnapshot's copy fails

2018-06-01 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20668:
---
Description: 
The exception from FileSystem operation in the finally block of 
ExportSnapshot#doWork may hide the real exception from FileUtil.copy call.

I was debugging the following error [~romil.choksi] saw during testing 
ExportSnapshot :
{code:java}
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR 
[main] util.AbstractHBaseTool: Error  running command-line tool
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: 
Directory/File does not exist /apps/ 
hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
checkOwner(FSDirectory.java:1777)
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
setOwner(FSDirAttrOp.java:82)
{code}
Here is corresponding code (with extra log added):
{code:java}
try {
  LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
initialOutputSnapshotDir);
  boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
initialOutputSnapshotDir, false,
  false, conf);
  LOG.info("return val = " + ret);
} catch (IOException e) {
  LOG.warn("Failed to copy the snapshot directory: from=" +
  snapshotDir + " to=" + initialOutputSnapshotDir, e);
  throw new ExportSnapshotException("Failed to copy the snapshot directory: 
from=" +
snapshotDir + " to=" + initialOutputSnapshotDir, e);
} finally {
  if (filesUser != null || filesGroup != null) {
LOG.warn((filesUser == null ? "" : "Change the owner of " + 
needSetOwnerDir + " to "
+ filesUser)
+ (filesGroup == null ? "" : ", Change the group of " + 
needSetOwnerDir + " to "
+ filesGroup));
setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
  }
{code}
"return val = " was not seen in rerun of the test.
 This is what the additional log revealed:
{code:java}
2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN  
[main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
 Permission denied:   user=hbase, access=WRITE, 
inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
check(FSPermissionChecker.java:399)
2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
checkPermission(FSPermissionChecker.java:255)
{code}
It turned out that the exception from {{setOwner}} call in the finally block 
eclipsed the real exception.

  was:
I was debugging the following error [~romil.choksi] saw during testing 
ExportSnapshot :
{code}
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR 
[main] util.AbstractHBaseTool: Error  running command-line tool
2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: 
Directory/File does not exist /apps/ 
hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
checkOwner(FSDirectory.java:1777)
2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
setOwner(FSDirAttrOp.java:82)
{code}
Here is corresponding code (with extra log added):
{code}
try {
  LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
initialOutputSnapshotDir);
  boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
initialOutputSnapshotDir, false,
  false, conf);
  LOG.info("return val = " + ret);
} catch (IOException 

[jira] [Resolved] (HBASE-20496) TestGlobalThrottler failing on branch-1 since revert of HBASE-9465

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-20496.

Resolution: Fixed

Resolved as part of HBASE-18116

> TestGlobalThrottler failing on branch-1 since revert of HBASE-9465
> --
>
> Key: HBASE-20496
> URL: https://issues.apache.org/jira/browse/HBASE-20496
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Priority: Minor
>
> Not sure why we didn't catch it earlier, but with my latest dev setup 
> including 8u JVM, TestGlobalThrottler fails reliably, and a git bisect finds 
> the problem at this revert:
> {noformat}
> commit ba7a936f74985eb9d974fdc87b0d06cb8cd8473d
> Author: Sean Busbey 
> Date: Tue Nov 7 23:50:35 2017 -0600
> Revert "HBASE-9465 Push entries to peer clusters serially"
> This reverts commit 441bc050b991c14c048617bc443b97f46e21b76f.
> Conflicts:
> hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java
> hbase-client/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java
> hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
> hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
> Signed-off-by: Andrew Purtell 
> {noformat}
> For now I'm going to disable the test. Leaving this open for debugging.





[jira] [Commented] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498405#comment-16498405
 ] 

Josh Elser commented on HBASE-20668:


+1 on v2

Let me update the Jira issue description to give some more clarity here.

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
> may hide exception from FileUtil.copy call
> 
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.





[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498362#comment-16498362
 ] 

Sean Busbey commented on HBASE-20669:
-

+1

bq. 0  findbugs  0m 1s  Findbugs executables are not available.

This is concerning. I'll run it locally. We should track that this is off in a 
JIRA so it can get fixed.

bq. -1  test4tests  0m 0s   The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch.

This doesn't fit scope for a new or changed test since it just changes an 
implementation detail and not the outcome.
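For reference, the flagged pattern and its fix can be sketched in plain Java (the names here are illustrative, not the HColumnDescriptor code):

```java
// Sketch of the findbugs "Boxing/unboxing to parse a primitive" finding
// (DM_BOXED_PRIMITIVE_FOR_PARSING) and the idiomatic fix.
public class ParsePrimitive {
    // Flagged pattern: Boolean.valueOf(...) creates a Boolean object that is
    // immediately unboxed back to a primitive.
    static boolean viaBoxing(String s) {
        return Boolean.valueOf(s).booleanValue();
    }

    // Preferred: parse straight to the primitive; no boxing involved.
    static boolean viaParse(String s) {
        return Boolean.parseBoolean(s);
    }
}
```

Both variants return the same value for every input, which is why this change is an implementation detail only and needs no new test.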

> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}





[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498356#comment-16498356
 ] 

Hadoop QA commented on HBASE-20669:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
12s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
17s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
16s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 26s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:36a7029 |
| JIRA Issue | HBASE-20669 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926114/HBASE-20669.branch-1.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c28ded46a1fb 4.4.0-43-generic 

[jira] [Updated] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20670:
---
Description: 
{noformat}
Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
{noformat}

The ZK trackers, including the maintenance mode tracker, are initialized only 
after we try to bring up the filesystem. If HDFS is in safe mode and the master 
is waiting on that, when an access to the master status page comes in we trip 
over this problem. There might be other issues after we fix this, but NPE is 
always a bug, so let's address it. One option is to connect the ZK based 
components with ZK before attempting to bring up the filesystem. Let me try 
that first. If that doesn't work we could at least throw an IOE.

  was:
{noformat}
Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
{noformat}

The ZK trackers, including the maintenance mode tracker, are initialized only 
after we try to bring up the filesystem. If HDFS is in safe mode then an access 
to the master status page trips over this problem. There might be other issues 
after we fix this, but NPE is always a bug, so let's address it. One option is 
to connect the ZK based components with ZK before attempting to bring up the 
filesystem. Let me try that first. If that doesn't work we could at least throw 
an IOE.


> NPE in HMaster#isInMaintenanceMode
> --
>
> Key: HBASE-20670
> URL: https://issues.apache.org/jira/browse/HBASE-20670
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 1.5.0, 1.3.3, 1.4.5
>
>
> {noformat}
> Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
> {noformat}
> The ZK trackers, including the maintenance mode tracker, are initialized only 
> after we try to bring up the filesystem. If HDFS is in safe mode and the 
> master is waiting on that, when an access to the master status page comes in 
> we trip over this problem. There might be other issues after we fix this, but 
> NPE is always a bug, so let's address it. One option is to connect the ZK 
> based components with ZK before attempting to bring up the filesystem. Let me 
> try that first. If that doesn't work we could at least throw an IOE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20670) NPE in HMaster#isInMaintenanceMode

2018-06-01 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-20670:
--

 Summary: NPE in HMaster#isInMaintenanceMode
 Key: HBASE-20670
 URL: https://issues.apache.org/jira/browse/HBASE-20670
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.2
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 1.5.0, 1.3.3, 1.4.5


{noformat}
Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hbase.master.HMaster.isInMaintenanceMode(HMaster.java:2559)
{noformat}

The ZK trackers, including the maintenance mode tracker, are initialized only 
after we try to bring up the filesystem. If HDFS is in safe mode then an access 
to the master status page trips over this problem. There might be other issues 
after we fix this, but NPE is always a bug, so let's address it. One option is 
to connect the ZK based components with ZK before attempting to bring up the 
filesystem. Let me try that first. If that doesn't work we could at least throw 
an IOE.
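One way to picture the null-guard fallback (as opposed to the preferred option of connecting the ZK components earlier) is a defensive read of the tracker state. A minimal sketch, assuming a tracker field that stays null until ZK initialization completes; class and field names are illustrative, not the actual HMaster internals:

```java
// Minimal sketch, assuming a tracker field that remains null until the
// ZK-based components are connected (e.g. while the master is still waiting
// on HDFS safe mode). Names are illustrative, not the real HMaster API.
public class MaintenanceModeSketch {
    // null until ZK initialization completes
    private volatile Boolean maintenanceMode;

    public boolean isInMaintenanceMode() {
        Boolean state = maintenanceMode;
        // Guard the pre-init window: report "not in maintenance" instead of NPE.
        return state != null && state;
    }

    public void onZkInitialized(boolean inMaintenance) {
        maintenanceMode = inMaintenance;
    }
}
```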





[jira] [Updated] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20669:

Status: Patch Available  (was: Open)

> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.3, 1.3.2, 1.5.0, 1.2.7
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}





[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498298#comment-16498298
 ] 

stack commented on HBASE-20634:
---

Retry patch 006; failed git checkout on precommit.

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch
>
>
> Found this when implementing HBASE-20424, where we will transit the peer sync 
> replication state while there is server crash.
> The problem is that, in ServerCrashAssign, we do not have the region lock, so 
> it is possible that after we call handleRIT to clear the existing 
> assign/unassign procedures related to this rs, and before we schedule the 
> assign procedures, it is possible that we schedule an unassign procedure 
> for a region on the crashed rs. This procedure will not receive the 
> ServerCrashException; instead, in addToRemoteDispatcher, it will find that it 
> cannot dispatch the remote call, and a FailedRemoteDispatchException 
> will be raised. But we do not treat this exception the same as 
> ServerCrashException; instead, we will try to expire the rs. Obviously the rs 
> has already been marked as expired, so this is almost a no-op. Then the 
> procedure will be stuck there forever.
> A possible way to fix it is to treat FailedRemoteDispatchException the same 
> as ServerCrashException, as it will be created in addToRemoteDispatcher 
> only, and the only reason we cannot dispatch a remote call is that the rs 
> has already been dead. The nodeMap is a ConcurrentMap so I think we could use 
> it as a guard.
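The proposed fix can be sketched as a shared recovery path. The exception classes below are local stand-ins for the HBase types, and the dispatcher logic is a simplification of the real procedure framework:

```java
// Hedged sketch of the fix direction: route FailedRemoteDispatchException
// through the same recovery path as ServerCrashException, instead of trying
// to re-expire an already-dead regionserver. These classes are stand-ins,
// not the actual HBase exception types.
class ServerCrashException extends RuntimeException {}
class FailedRemoteDispatchException extends RuntimeException {}

public class DispatchHandler {
    public static String handle(RuntimeException e) {
        // Both exceptions mean the target rs is gone, so the procedure should
        // take the server-crash recovery path rather than stall.
        if (e instanceof ServerCrashException
                || e instanceof FailedRemoteDispatchException) {
            return "server-crash-recovery";
        }
        return "retry";
    }
}
```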





[jira] [Updated] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck

2018-06-01 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20634:
--
Attachment: HBASE-20634.branch-2.0.006.patch

> Reopen region while server crash can cause the procedure to be stuck
> 
>
> Key: HBASE-20634
> URL: https://issues.apache.org/jira/browse/HBASE-20634
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, 
> HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, 
> HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, 
> HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch
>
>
> Found this when implementing HBASE-20424, where we will transit the peer sync 
> replication state while there is server crash.
> The problem is that, in ServerCrashAssign, we do not have the region lock, so 
> it is possible that after we call handleRIT to clear the existing 
> assign/unassign procedures related to this rs, and before we schedule the 
> assign procedures, it is possible that we schedule an unassign procedure 
> for a region on the crashed rs. This procedure will not receive the 
> ServerCrashException; instead, in addToRemoteDispatcher, it will find that it 
> cannot dispatch the remote call, and a FailedRemoteDispatchException 
> will be raised. But we do not treat this exception the same as 
> ServerCrashException; instead, we will try to expire the rs. Obviously the rs 
> has already been marked as expired, so this is almost a no-op. Then the 
> procedure will be stuck there forever.
> A possible way to fix it is to treat FailedRemoteDispatchException the same 
> as ServerCrashException, as it will be created in addToRemoteDispatcher 
> only, and the only reason we cannot dispatch a remote call is that the rs 
> has already been dead. The nodeMap is a ConcurrentMap so I think we could use 
> it as a guard.





[jira] [Updated] (HBASE-20592) Create a tool to verify tables do not have prefix tree encoding

2018-06-01 Thread Peter Somogyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi updated HBASE-20592:
--
   Resolution: Fixed
Fix Version/s: 2.0.1
   3.0.0
 Release Note: PreUpgradeValidator tool with DataBlockEncoding validator 
was added to verify that a cluster is upgradable to HBase 2.
   Status: Resolved  (was: Patch Available)

Pushed to branch-2.0 and up. The documentation change for upgrading.adoc was 
left out of branch-2 since it does not have an 'Upgrading from 1.x to 2.x' 
section. Thanks for the reviews.

> Create a tool to verify tables do not have prefix tree encoding
> ---
>
> Key: HBASE-20592
> URL: https://issues.apache.org/jira/browse/HBASE-20592
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability, tooling
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20592.master.001.patch, 
> HBASE-20592.master.002.patch, HBASE-20592.master.003.patch, 
> HBASE-20592.master.004.patch, HBASE-20592.master.005.patch, 
> HBASE-20592.master.006.patch
>
>
> HBase 2.0.0 removed PREFIX_TREE encoding so users need to modify data block 
> encoding to something else before upgrading to HBase 2.0+. A tool would help 
> users to verify that there are no tables left with PREFIX_TREE encoding.
> The tool needs to check the following:
>  * There are no tables where DATA_BLOCK_ENCODING => 'PREFIX_TREE'
>  * -Check existing hfiles that none of them have PREFIX_TREE encoding (in 
> case table description is changed but hfiles were not rewritten)-
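The first check can be sketched as a simple scan over column-family encodings. A minimal sketch only: the Map-based input is a stand-in for the real Admin/TableDescriptor API, not how the actual PreUpgradeValidator tool reads cluster state:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hedged sketch of the validator's core check: flag any column family whose
// DATA_BLOCK_ENCODING is PREFIX_TREE. Input format ("table:family" -> encoding)
// is an illustrative stand-in for the real HBase descriptor API.
public class PrefixTreeCheck {
    public static List<String> findPrefixTreeFamilies(Map<String, String> familyEncodings) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> e : familyEncodings.entrySet()) {
            if ("PREFIX_TREE".equals(e.getValue())) {
                offenders.add(e.getKey());
            }
        }
        return offenders;
    }
}
```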





[jira] [Commented] (HBASE-18116) Replication source in-memory accounting should not include bulk transfer hfiles

2018-06-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498233#comment-16498233
 ] 

Andrew Purtell commented on HBASE-18116:


Thanks [~xucang] let me try to commit this again.

> Replication source in-memory accounting should not include bulk transfer 
> hfiles
> ---
>
> Key: HBASE-18116
> URL: https://issues.apache.org/jira/browse/HBASE-18116
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
> Attachments: HBASE-18116-branch-1.patch, 
> HBASE-18116.master.001.patch, HBASE-18116.master.002.patch, 
> HBASE-18116.master.003.patch
>
>
> In ReplicationSourceWALReaderThread we maintain a global quota on enqueued 
> replication work for preventing OOM by queuing up too many edits into queues 
> on heap. When calculating the size of a given replication queue entry, if it 
> has associated hfiles (is a bulk load to be replicated as a batch of hfiles), 
> we get the file sizes and include the sum. We then apply that result to the 
> quota. This isn't quite right. Those hfiles will be pulled by the sink as a 
> file copy, not pushed by the source. The cells in those files are not queued 
> in memory at the source and therefore shouldn't be counted against the quota.
> Related, the sum of the hfile sizes is also included when checking if queued 
> work exceeds the configured replication queue capacity, which is by default 
> 64 MB. HFiles are commonly much larger than this. 
> So what happens is when we encounter a bulk load replication entry typically 
> both the quota and capacity limits are exceeded, we break out of loops, and 
> send right away. What is transferred on the wire via HBase RPC though has 
> only a partial relationship to the calculation. 
> Depending how you look at it, it makes sense to factor hfile file sizes 
> against replication queue capacity limits. The sink will be occupied 
> transferring those files at the HDFS level. Anyway, this is how we have been 
> doing it and it is too late to change now. I do not however think it is 
> correct to apply hfile file sizes against a quota for in memory state on the 
> source. The source doesn't queue or even transfer those bytes. 
> Something I noticed while working on HBASE-18027.
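The accounting change argued for above can be sketched as follows; the class, field, and method names are illustrative, not the actual ReplicationSourceWALReaderThread code:

```java
// Hedged sketch of the proposed accounting: charge only in-memory WAL-edit
// bytes against the source-side quota, while keeping the existing behavior
// (edits + hfile sizes) for the queue-capacity check. Names are illustrative.
public class ReplicationEntrySize {
    final long editBytes;   // cells actually queued in memory at the source
    final long hfileBytes;  // bulk-load files pulled by the sink over HDFS

    public ReplicationEntrySize(long editBytes, long hfileBytes) {
        this.editBytes = editBytes;
        this.hfileBytes = hfileBytes;
    }

    // Quota protects source heap, so hfile bytes are excluded.
    public long quotaCharge() { return editBytes; }

    // Capacity check keeps the established behavior of counting hfiles.
    public long capacityCharge() { return editBytes + hfileBytes; }
}
```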





[jira] [Commented] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498212#comment-16498212
 ] 

Ted Yu commented on HBASE-20668:


Yes.
The patch is to give the operator correct information about the cause of the 
failure instead of leaving them to stare at a cryptic exception.

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
> may hide exception from FileUtil.copy call
> 
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.
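The direction of the fix can be sketched with a success flag guarding the finally block. The Runnable parameters below stand in for the real FileUtil.copy and setOwner calls; this is a sketch of the pattern, not the actual patch:

```java
// Minimal sketch of the fix direction: only change permissions when the copy
// succeeded, so a failure in the cleanup path cannot eclipse the copy
// exception. The Runnables are illustrative stand-ins for FileUtil.copy and
// setOwner, not the real ExportSnapshot API.
public class ExportSnapshotSketch {
    public static void exportWithSafePermissions(Runnable copy, Runnable setOwner) {
        boolean copySucceeded = false;
        try {
            copy.run();
            copySucceeded = true;
        } finally {
            // Skip the chown/chmod entirely if the copy threw; the original
            // exception from copy.run() then propagates unmasked.
            if (copySucceeded) {
                setOwner.run();
            }
        }
    }
}
```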





[jira] [Commented] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498206#comment-16498206
 ] 

Josh Elser commented on HBASE-20668:


{quote}It seems the ownership of these two directories should be handled in 
another JIRA.
{quote}
Yeah, it just took me a while to see what the problem was. It seems like it is 
simply: "we should not chown/chmod on the destination filesystem unless the 
copy to that destination filesystem was successful". Is that right?

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
> may hide exception from FileUtil.copy call
> 
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.





[jira] [Commented] (HBASE-19724) Fix Checkstyle errors in hbase-hadoop2-compat

2018-06-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498198#comment-16498198
 ] 

Hudson commented on HBASE-19724:


Results for branch branch-2.0
[build #375 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/375/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/375//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/375//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/375//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Fix Checkstyle errors in hbase-hadoop2-compat
> -
>
> Key: HBASE-19724
> URL: https://issues.apache.org/jira/browse/HBASE-19724
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jan Hentschel
>Assignee: Jan Hentschel
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-19724.master.001.patch, 
> HBASE-19724.master.002.patch, HBASE-19724.master.003.patch, 
> HBASE-19724.master.004.patch
>
>
> Fix the remaining Checkstyle errors in the *hbase-hadoop2-compat* module and 
> enable Checkstyle to fail on violations.





[jira] [Commented] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498197#comment-16498197
 ] 

Ted Yu commented on HBASE-20668:


It seems the ownership of these two directories should be handled in another 
JIRA.

What do you think ?

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
> may hide exception from FileUtil.copy call
> 
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN 
>  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: 
> from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 
> to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException:
>  Permission denied:   user=hbase, access=WRITE, 
> inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - 
> run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.  
> checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block 
> eclipsed the real exception.





[jira] [Commented] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

2018-06-01 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498191#comment-16498191
 ] 

Josh Elser commented on HBASE-20668:


{quote}One solution I can think of is to create .hbase-snapshot and 
.hbase-snapshot/.tmp directories upon cluster startup.
{quote}
I don't think we'd have to proactively create that directory up-front. It seems 
like it might just be an overly aggressive {{mkdir -p}} when we take the first 
snapshot. I can look if you're not certain now.

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork 
> may hide exception from FileUtil.copy call
> 
>
> Key: HBASE-20668
> URL: https://issues.apache.org/jira/browse/HBASE-20668
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20668.v1.txt, 20668.v2.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing 
> ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 
> ERROR [main] util.AbstractHBaseTool: Error  running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException:
>  Directory/File does not exist /apps/ 
> hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.  
> checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - 
> run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.  
> setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
> try {
>   LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + 
> initialOutputSnapshotDir);
>   boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, 
> initialOutputSnapshotDir, false,
>   false, conf);
>   LOG.info("return val = " + ret);
> } catch (IOException e) {
>   LOG.warn("Failed to copy the snapshot directory: from=" +
>   snapshotDir + " to=" + initialOutputSnapshotDir, e);
>   throw new ExportSnapshotException("Failed to copy the snapshot 
> directory: from=" +
> snapshotDir + " to=" + initialOutputSnapshotDir, e);
> } finally {
>   if (filesUser != null || filesGroup != null) {
> LOG.warn((filesUser == null ? "" : "Change the owner of " + 
> needSetOwnerDir + " to "
> + filesUser)
> + (filesGroup == null ? "" : ", Change the group of " + 
> needSetOwnerDir + " to "
> + filesGroup));
> setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>   }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:drx-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from the {{setOwner}} call in the finally block
> eclipsed the real exception.
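The masking described above is standard Java try/finally behavior, not anything specific to ExportSnapshot: when a finally block throws, the in-flight exception from the try body is discarded. A minimal, self-contained sketch (class name and message strings are invented for illustration) demonstrates it:

```java
public class FinallyMasking {

    // Mirrors the shape of the snippet above: the try body fails first
    // (the FileUtil.copy failure), then the finally block fails too
    // (the setOwner call hitting AccessControlException).
    static String attempt() {
        try {
            try {
                throw new RuntimeException("copy failed (the real error)");
            } finally {
                // Throwing here discards the in-flight exception from the try body.
                throw new RuntimeException("setOwner failed (the masking error)");
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Only the finally-block exception ever reaches the caller.
        System.out.println(attempt());
    }
}
```

Consistent with the issue title, one remedy is to skip the permission change entirely when the copy fails; another common pattern is to catch and log any secondary failure inside the finally block so the original exception propagates.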



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498175#comment-16498175
 ] 

Wei-Chiu Chuang commented on HBASE-20669:
-

Silly me. Thanks Sean! Learned a new thing today. I quickly checked the HBase
codebase (branch-1) and did not find any other statements matching the same pattern
(int.*Integer.valueOf).
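For context, the flagged pattern and the preferred form can be sketched as follows (the class name and sample value are invented for illustration; the real occurrence is in HColumnDescriptor.setValue):

```java
public class ParsePrimitive {
    public static void main(String[] args) {
        String value = "579";

        // Flagged pattern: Integer.valueOf returns a boxed Integer,
        // which is then auto-unboxed to assign into the int variable.
        int viaBoxing = Integer.valueOf(value);

        // Preferred: Integer.parseInt returns the primitive directly,
        // avoiding the intermediate Integer object.
        int direct = Integer.parseInt(value);

        // Both yield the same primitive value.
        System.out.println(viaBoxing == direct);
    }
}
```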

> [findbugs] autoboxing to parse primitive
> 
>
> Key: HBASE-20669
> URL: https://issues.apache.org/jira/browse/HBASE-20669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>Reporter: Sean Busbey
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.5
>
> Attachments: HBASE-20669.branch-1.002.patch
>
>
> HBASE-18864 introduced a change that's been flagged by findbugs. example on 
> branch-1:
> {quote}
> FindBugs  module:hbase-client
> Boxing/unboxing to parse a primitive 
> org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[], byte[]) At 
> HColumnDescriptor.java:org.apache.hadoop.hbase.HColumnDescriptor.setValue(byte[],
>  byte[]) At HColumnDescriptor.java:[line 579]
> {quote}





[jira] [Updated] (HBASE-20669) [findbugs] autoboxing to parse primitive

2018-06-01 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20669:

Attachment: HBASE-20669.branch-1.002.patch





