[jira] [Created] (HBASE-16824) Make replacement of path the first operation during WAL rotation

2016-10-12 Thread Atri Sharma (JIRA)
Atri Sharma created HBASE-16824:
---

 Summary: Make replacement of path the first operation during WAL 
rotation
 Key: HBASE-16824
 URL: https://issues.apache.org/jira/browse/HBASE-16824
 Project: HBase
  Issue Type: Bug
Reporter: Atri Sharma


In https://issues.apache.org/jira/browse/HBASE-12074, we hit an error when an 
async thread calls flush on a WAL record that has already been closed because 
the WAL is being rotated. This JIRA investigates whether setting the new WAL 
record path as the first operation during WAL rotation will fix the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16824) Make replacement of path the first operation during WAL rotation

2016-10-12 Thread Atri Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571060#comment-15571060
 ] 

Atri Sharma commented on HBASE-16824:
-

[~shrikant] FYI please

> Make replacement of path the first operation during WAL rotation
> 
>
> Key: HBASE-16824
> URL: https://issues.apache.org/jira/browse/HBASE-16824
> Project: HBase
>  Issue Type: Bug
>Reporter: Atri Sharma
>
> In https://issues.apache.org/jira/browse/HBASE-12074, we hit an error if an 
> async thread calls flush on a WAL record already closed as the WAL is being 
> rotated. This JIRA investigates if setting the new WAL record path as the 
> first operation during WAL rotation will fix the issue.





[jira] [Commented] (HBASE-15921) Add first AsyncTable impl and create TableImpl based on it

2016-10-12 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571057#comment-15571057
 ] 

Duo Zhang commented on HBASE-15921:
---

[~stack] [~enis] PTAL. Thanks.

> Add first AsyncTable impl and create TableImpl based on it
> --
>
> Key: HBASE-15921
> URL: https://issues.apache.org/jira/browse/HBASE-15921
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Jurriaan Mous
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-15921-v2.patch, HBASE-15921-v3.patch, 
> HBASE-15921-v4.patch, HBASE-15921-v5.patch, HBASE-15921-v6.patch, 
> HBASE-15921-v7.patch, HBASE-15921-v8.patch, HBASE-15921-v9.patch, 
> HBASE-15921.demo.patch, HBASE-15921.patch, HBASE-15921.v1.patch
>
>
> First we create an AsyncTable interface with an implementation without the Scan 
> functionality. That will land in a separate patch since it needs a refactor 
> of the existing scans.
> Also added is a new TableImpl to replace HTable. It uses the AsyncTableImpl 
> internally and should be a bit faster because it jumps through fewer hoops 
> to do the ProtoBuf transportation. This way we can run all existing tests on the 
> AsyncTableImpl to guarantee its quality.





[jira] [Commented] (HBASE-12074) TestLogRollingNoCluster#testContendedLogRolling() failed

2016-10-12 Thread Atri Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571053#comment-15571053
 ] 

Atri Sharma commented on HBASE-12074:
-

Could a possible fix be to make rollWriter acquire the zig-zag latch and call 
doReplaceWriter as the first operation, before attempting to close and flush 
the log files? This would let new HLog writer threads see the newPath 
already set instead of waiting for the flush to happen, and the old-file 
cleanup could happen in a background thread.
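The proposed ordering can be sketched as follows. This is a hypothetical, heavily simplified model: the names rollWriter and doReplaceWriter mirror the comment above, not the actual FSHLog implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hedged sketch of the proposed ordering: publish the new writer path first,
// then close/flush the old writer afterwards (possibly in the background).
// Concurrent appenders that read the path never race against an
// already-closed writer.
class RollSketch {
    private final AtomicReference<String> currentPath = new AtomicReference<>("wal.1");

    // Step 1: atomically replace the path so concurrent threads
    // immediately see the new WAL.
    String doReplaceWriter(String newPath) {
        return currentPath.getAndSet(newPath);
    }

    // rollWriter publishes the new path before any close/flush of the
    // old file; cleanup of the returned old path can run in the background.
    String rollWriter(String newPath) {
        String oldPath = doReplaceWriter(newPath);
        // old-file flush/close/cleanup would happen here
        return oldPath;
    }

    String path() { return currentPath.get(); }
}
```

Under this ordering, a reader observing the path either sees the old writer while it is still open, or the new one; the window where a closed writer is still visible disappears.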

> TestLogRollingNoCluster#testContendedLogRolling() failed
> 
>
> Key: HBASE-12074
> URL: https://issues.apache.org/jira/browse/HBASE-12074
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Stephen Yuan Jiang
>
> TestLogRollingNoCluster#testContendedLogRolling() failed on a 0.98 run. I am 
> trying to understand the context. 
> The failure is this: 
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.TestLogRollingNoCluster.testContendedLogRolling(TestLogRollingNoCluster.java:80)
> {code}
> Caused because one of the Appenders calling FSHLog.sync() threw IOE because 
> of concurrent close: 
> {code}
> 2014-09-23 16:36:39,530 FATAL [pool-1-thread-1-WAL.AsyncSyncer0] 
> wal.FSHLog$AsyncSyncer(1246): Error while AsyncSyncer sync, request close of 
> hlog 
> java.io.IOException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>   ... 2 more
> 2014-09-23 16:36:39,531 INFO  [32] wal.TestLogRollingNoCluster$Appender(137): 
> Caught exception from Appender:32
> java.io.IOException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>   ... 2 more
> 2014-09-23 16:36:39,532 INFO  [19] wal.TestLogRollingNoCluster$Appender(137): 
> Caught exception from Appender:19
> java.io.IOException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>   ... 2 more
> {code}
> The code is: 
> {code}
>   public void sync() throws IOException {
> try {
>   this.output.flush();
>   this.output.sync();
> } catch (NullPointerException npe) {
>   // Concurrent close...
>   throw new IOException(npe);
> }
>   }
> {code}
> I think the test case was written exactly to catch this case: 
> {code}
>* Spin up a bunch of threads and have them all append to a WAL.  Roll the
>* WAL frequently to try and trigger NPE.
> {code}
> This is why I am reporting it even though I don't have much context. It may 
> not be a test issue, but an actual bug. 





[jira] [Commented] (HBASE-16414) Improve performance for RPC encryption with Apache Common Crypto

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571042#comment-15571042
 ] 

Hadoop QA commented on HBASE-16414:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
6s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 2s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
34s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
46s {color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 49s 
{color} | {color:red} hbase-protocol-shaded in master has 22 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 11s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 20s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
47s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 15 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 2s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 39s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s 
{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 36s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 26s {color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 
3s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 241m 35s {color} 
| {color:black} {color} |
\\

[jira] [Commented] (HBASE-16810) HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are in /hbase/draining znode and unloaded

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571032#comment-15571032
 ] 

Hadoop QA commented on HBASE-16810:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
44s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
5s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 22s 
{color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m 1s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 42s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 127m 47s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestClusterId |
| Timed out junit tests | 
org.apache.hadoop.hbase.replication.TestPerTableCFReplication |
|   | org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |
|   | org.apache.hadoop.hbase.tool.TestCanaryTool |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:b2c5d84 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833039/HBASE-16810.branch-1.v2.patch
 |
| JIRA Issue | HBASE-16810 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux c0ccd5d961c4 3.1

[jira] [Updated] (HBASE-15921) Add first AsyncTable impl and create TableImpl based on it

2016-10-12 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-15921:
--
Attachment: HBASE-15921-v9.patch

Remove curator dependency as we only use a simple blocking get in this patch. 
Will open new issue to discuss the zk implementation at client side.

> Add first AsyncTable impl and create TableImpl based on it
> --
>
> Key: HBASE-15921
> URL: https://issues.apache.org/jira/browse/HBASE-15921
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Jurriaan Mous
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-15921-v2.patch, HBASE-15921-v3.patch, 
> HBASE-15921-v4.patch, HBASE-15921-v5.patch, HBASE-15921-v6.patch, 
> HBASE-15921-v7.patch, HBASE-15921-v8.patch, HBASE-15921-v9.patch, 
> HBASE-15921.demo.patch, HBASE-15921.patch, HBASE-15921.v1.patch
>
>
> First we create an AsyncTable interface with an implementation without the Scan 
> functionality. That will land in a separate patch since it needs a refactor 
> of the existing scans.
> Also added is a new TableImpl to replace HTable. It uses the AsyncTableImpl 
> internally and should be a bit faster because it jumps through fewer hoops 
> to do the ProtoBuf transportation. This way we can run all existing tests on the 
> AsyncTableImpl to guarantee its quality.





[jira] [Commented] (HBASE-16653) Backport HBASE-11393 to all branches which support namespace

2016-10-12 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571031#comment-15571031
 ] 

Ashish Singhi commented on HBASE-16653:
---

bq. ReplicationPeerConfig class is InterfaceAudience.Public and the v3 patch 
added two methods to it. Does this have compatibility issues?
This should be ok.

> Backport HBASE-11393 to all branches which support namespace
> 
>
> Key: HBASE-16653
> URL: https://issues.apache.org/jira/browse/HBASE-16653
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.0.5, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 1.4.0
>
> Attachments: HBASE-16653-branch-1-v1.patch, 
> HBASE-16653-branch-1-v2.patch, HBASE-16653-branch-1-v3.patch
>
>
> As HBASE-11386 mentioned, the parsing code for the replication table-cfs config 
> goes wrong when a table name contains a namespace, so we can only configure the 
> default namespace's tables in the peer. It is a bug for all branches which 
> support namespace. HBASE-11393 resolved this by using a pb object, but it was 
> only merged to the master branch. Other branches still have this problem. I 
> think we should fix this bug in all branches which support namespace.





[jira] [Commented] (HBASE-16792) Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571028#comment-15571028
 ] 

Hudson commented on HBASE-16792:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1777 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1777/])
HBASE-16792 Reuse KeyValue.KeyOnlyKeyValue in (ramkrishna: rev 
f11aa4542f8f5489823fb72d1e9bc98e5cc6d742)
* (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* (edit) 
hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java


> Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState
> --
>
> Key: HBASE-16792
> URL: https://issues.apache.org/jira/browse/HBASE-16792
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: binlijin
>Assignee: binlijin
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16792_master.patch
>
>
> Currently every SeekerState#invalidate allocates a fresh 
> KeyValue.KeyOnlyKeyValue; we should reuse it instead.
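The reuse pattern the issue describes can be sketched as below. All names here are illustrative stand-ins, not the actual BufferedDataBlockEncoder code: one key object is allocated once and re-pointed at new backing bytes instead of being re-created on every invalidate.

```java
// Hedged sketch of object reuse: keep a single mutable key instance and
// re-point it at the new backing bytes, avoiding a fresh allocation per call.
class SeekerStateSketch {
    static final class KeyOnly {
        byte[] buf; int offset; int length;
        void setKey(byte[] buf, int offset, int length) {
            this.buf = buf; this.offset = offset; this.length = length;
        }
    }

    // Allocated once and reused across invalidate() calls.
    private final KeyOnly currentKey = new KeyOnly();

    KeyOnly invalidate(byte[] newKey) {
        currentKey.setKey(newKey, 0, newKey.length); // reuse, no new allocation
        return currentKey;
    }
}
```

Because the same instance is handed back each time, callers must not hold onto it across invalidations; that trade-off is what makes the allocation saving possible.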





[jira] [Commented] (HBASE-16724) Snapshot owner can't clone

2016-10-12 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571026#comment-15571026
 ] 

Ashish Singhi commented on HBASE-16724:
---

+1. Will commit this shortly.
Can you attach a branch-1 patch if the master branch patch doesn't apply?

> Snapshot owner can't clone
> --
>
> Key: HBASE-16724
> URL: https://issues.apache.org/jira/browse/HBASE-16724
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.0.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Attachments: HBASE-16724-V2.patch, HBASE-16724-V3.patch, 
> HBASE-16724.patch
>
>
> Currently only Global admin has the access of cloning a snapshot.
> In AccessController,
> {code}
>   @Override
>   public void preCloneSnapshot(final 
> ObserverContext<MasterCoprocessorEnvironment> ctx,
>   final SnapshotDescription snapshot, final HTableDescriptor 
> hTableDescriptor)
>   throws IOException {
> requirePermission(getActiveUser(ctx), "cloneSnapshot " + 
> snapshot.getName(), Action.ADMIN);
>   }
> {code}
> The snapshot owner should be able to clone it; we need to add a check like:
> {code}
> SnapshotDescriptionUtils.isSnapshotOwner(snapshot, user)
> {code}
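The combined check the issue suggests could look roughly like the sketch below. The helpers here (allowed, the owner/admin parameters) are stand-ins for illustration, not the real HBase AccessController or SnapshotDescriptionUtils API.

```java
// Hedged sketch of the suggested authorization rule: the snapshot owner may
// clone without global ADMIN; anyone else still needs the ADMIN permission.
class CloneAuthSketch {
    static boolean allowed(String snapshotOwner, String user, boolean isAdmin) {
        // Owner short-circuit: owners can clone their own snapshots.
        if (snapshotOwner != null && snapshotOwner.equals(user)) {
            return true;
        }
        // Everyone else falls back to the existing ADMIN requirement.
        return isAdmin;
    }
}
```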





[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571018#comment-15571018
 ] 

Ashish Singhi commented on HBASE-16807:
---

The test failures don't seem to be related to this patch.
+1

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. In 
> that case ZooKeeperNodeTracker will not refresh the cached data and 
> MasterAddressTracker will always return the old active HMaster details to the 
> RegionServer on ServiceException.
> Though we create the region server stub on failure, we do so without refreshing 
> the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is null; otherwise it uses the same 
> cached data. 
> So in the above case the RegionServer will never report to the new active 
> HMaster until HMaster failover or RegionServer restart.
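The caching pitfall can be illustrated in miniature. This is a hedged, self-contained model, not the real HRegionServer/MasterAddressTracker API: if the stale cached address is never refreshed after a failure, every retry returns the same dead master.

```java
// Hedged illustration: a lookup that reuses a stale cached address forever
// never recovers; forcing refresh after a failed attempt does.
class MasterLookupSketch {
    private final String zkValue;   // what ZK currently holds (the new master)
    private String cached;          // possibly stale cached value

    MasterLookupSketch(String cached, String zkValue) {
        this.cached = cached;
        this.zkValue = zkValue;
    }

    String getMasterAddress(boolean refresh) {
        if (refresh) {
            cached = zkValue;       // re-read from ZK
        }
        return cached;
    }

    // Retry that flips refresh to true once the cached address fails,
    // instead of recreating the stub against the same stale address.
    String connect() {
        String sn = getMasterAddress(false);   // first attempt: cached data
        if (!sn.equals(zkValue)) {             // stub creation would fail here
            sn = getMasterAddress(true);       // retry, pulling from ZK directly
        }
        return sn;
    }
}
```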





[jira] [Commented] (HBASE-16716) OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains forever

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570980#comment-15570980
 ] 

Hudson commented on HBASE-16716:


FAILURE: Integrated in Jenkins build HBase-1.4 #465 (See 
[https://builds.apache.org/job/HBase-1.4/465/])
HBASE-16716 OfflineMetaRepair leaves empty directory inside /hbase/WALs (tedyu: 
rev e2278f9544380ec7abc92c3592bbe2068e62cb45)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java


> OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains 
> forever
> -
>
> Key: HBASE-16716
> URL: https://issues.apache.org/jira/browse/HBASE-16716
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 2.0.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16716-V2.patch, HBASE-16716-branch-1.patch, 
> HBASE-16716.patch
>
>
> OfflineMetaRepair rebuilds the meta table; while creating the meta region it 
> creates its own WAL (inside /hbase/WALs/hbck-meta-recovery-) which will 
> be closed and archived after rebuilding meta. 
> {noformat}
> hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
> >> /hbase/WALs/hbck-meta-recovery-
> {noformat}
> It doesn't remove the now-empty directory; the empty directory should be 
> removed after success.





[jira] [Commented] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570956#comment-15570956
 ] 

Allan Yang commented on HBASE-16816:


Yes, they all passed locally 

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Fix For: 1.4.0
>
> Attachments: HBASE-16816-branch-1-v2.patch, 
> HBASE-16816-branch-1-v3.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, which 
> makes it not movable. So the caller has no way to know that the move region 
> operation failed.
> It is a problem for "region_move.rb". It only gives up moving a region if an 
> exception is thrown. Otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}
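The fail-fast behavior the issue asks for can be sketched as below. RegionState here is a stand-in enum and checkMovable an illustrative helper, not the real HMaster code:

```java
// Hedged sketch: throw when the region is unknown or not movable, so callers
// like region_move.rb can give up immediately instead of polling to a timeout.
class MoveCheckSketch {
    enum RegionState { OPEN, SPLIT, IN_TRANSITION }

    static void checkMovable(RegionState state) {
        if (state == null) {
            throw new IllegalStateException("unknown region");
        }
        // A split or in-transition region cannot be moved; surfacing this as
        // an exception is what lets the Ruby script stop retrying.
        if (state != RegionState.OPEN) {
            throw new IllegalStateException("region not movable: " + state);
        }
    }
}
```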





[jira] [Commented] (HBASE-16642) Use DelayQueue instead of TimeoutBlockingQueue

2016-10-12 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570951#comment-15570951
 ] 

Hiroshi Ikeda commented on HBASE-16642:
---

You got what I want to say, but don't synchronize per comparison.

It would require a complicated mechanism and a real execution cost to make a 
queue able to dynamically re-order its elements. In other words, in general a 
queue doesn't support such a mechanism, and the priority of an element should 
not be changed while the element is stored in the queue; otherwise the queue 
will behave unexpectedly.

That means synchronizing per comparison is, at best, pure waste. 
Synchronization has some cost and prevents runtime optimization, and even 
though that might be negligible, there is no reason to pay for nothing.
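The point about mutating priorities can be demonstrated with a plain java.util.PriorityQueue. The class below is a self-contained illustration, not HBase code: the heap orders elements only at insertion/removal time, so mutating a queued element's priority silently breaks the ordering, and no amount of synchronization around compareTo repairs it.

```java
import java.util.PriorityQueue;

// Demonstrates why an element's priority must not change while it is queued:
// the heap is not re-sifted on mutation, so peek() keeps returning the stale
// head even though another element now has the minimum priority.
class MutablePriorityDemo {
    static final class Task implements Comparable<Task> {
        int priority;
        Task(int p) { priority = p; }
        @Override public int compareTo(Task o) {
            return Integer.compare(priority, o.priority);
        }
    }

    static int headPriorityAfterMutation() {
        PriorityQueue<Task> q = new PriorityQueue<>();
        Task a = new Task(1), b = new Task(2), c = new Task(3);
        q.add(a); q.add(b); q.add(c);
        a.priority = 10;           // mutated while queued: heap not re-sifted
        return q.peek().priority;  // stale head: 10, not the true minimum 2
    }
}
```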

> Use DelayQueue instead of TimeoutBlockingQueue
> --
>
> Key: HBASE-16642
> URL: https://issues.apache.org/jira/browse/HBASE-16642
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Hiroshi Ikeda
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16642-v2.patch, HBASE-16642-v3.patch, 
> HBASE-16642.master.V1.patch
>
>
> Enqueue poisons in order to wake up and end the internal threads.





[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570911#comment-15570911
 ] 

Pankaj Kumar commented on HBASE-16807:
--

Thanks [~chenheng] for reviewing the patch.

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. In 
> that case ZooKeeperNodeTracker will not refresh the cached data and 
> MasterAddressTracker will always return the old active HMaster details to the 
> RegionServer on ServiceException.
> Though we create the region server stub on failure, we do so without refreshing 
> the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is NULL; otherwise it will use the same 
> cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until an HMaster failover or RegionServer restart.
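The fix described above can be sketched in a few lines of plain Java. All class and method names below are invented stand-ins for illustration (this is not the actual HRegionServer code): the point is that after a failed attempt, the retry loop forces a refresh of the cached master address instead of trusting the cache forever.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MasterStubRetrySketch {
    // Stand-in for MasterAddressTracker: returns a stale cached address,
    // and re-reads the "znode" only when asked to refresh.
    static class AddressTracker {
        String cached = "old-master:16000";
        String zk = "new-master:16000";
        String getMasterAddress(boolean refresh) {
            if (refresh) cached = zk;   // simulate re-reading the master znode
            return cached;
        }
    }

    static String createStub(AddressTracker tracker, AtomicInteger attempts) {
        boolean refresh = false; // first attempt: use cached data
        while (true) {
            attempts.incrementAndGet();
            String sn = tracker.getMasterAddress(refresh);
            if (connect(sn)) {
                return sn;
            }
            // On failure (e.g. a ServiceException), stop trusting the cache:
            // pull the master address from ZK directly on the next attempt.
            refresh = true;
        }
    }

    static boolean connect(String addr) {
        return addr.startsWith("new-master"); // only the real master answers
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        String master = createStub(new AddressTracker(), attempts);
        System.out.println(master + " after " + attempts.get() + " attempts");
    }
}
```

With the stale cache above, the first attempt fails and the second (refreshed) attempt succeeds; without the `refresh = true` line the loop would spin on the old address indefinitely.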



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16641) QA tests for hbase-client skip the second part.

2016-10-12 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen resolved HBASE-16641.
---
Resolution: Duplicate

> QA tests for hbase-client skip the second part.
> ---
>
> Key: HBASE-16641
> URL: https://issues.apache.org/jira/browse/HBASE-16641
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/3547/artifact/patchprocess/patch-unit-hbase-client.txt
> {code}
> [INFO] --- maven-surefire-plugin:2.18.1:test (secondPartTestsExecution) @ 
> hbase-client ---
> [INFO] Tests are skipped.
> {code}
> The first part passed fine, but the second part is skipped. 
> Notice hbase-client/pom.xml 
> {code}
> <execution>
>   <id>secondPartTestsExecution</id>
>   <phase>test</phase>
>   <goals>
>     <goal>test</goal>
>   </goals>
>   <configuration>
>     <skip>true</skip>
>   </configuration>
> </execution>
> {code}
> If I change the 'skip' to false, the second part is triggered. But 
> this configuration has existed for a long time; was the cmd line on the build box 
> updated recently? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16641) QA tests for hbase-client skip the second part.

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570896#comment-15570896
 ] 

Heng Chen commented on HBASE-16641:
---

It seems to be fixed, but the QA run in HBASE-16785 seems to have been aborted at 
hbase-procedure.
As this issue is part of HBASE-16785, let me resolve it as a duplicate. 

> QA tests for hbase-client skip the second part.
> ---
>
> Key: HBASE-16641
> URL: https://issues.apache.org/jira/browse/HBASE-16641
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/3547/artifact/patchprocess/patch-unit-hbase-client.txt
> {code}
> [INFO] --- maven-surefire-plugin:2.18.1:test (secondPartTestsExecution) @ 
> hbase-client ---
> [INFO] Tests are skipped.
> {code}
> The first part passed fine, but the second part is skipped. 
> Notice hbase-client/pom.xml 
> {code}
> <execution>
>   <id>secondPartTestsExecution</id>
>   <phase>test</phase>
>   <goals>
>     <goal>test</goal>
>   </goals>
>   <configuration>
>     <skip>true</skip>
>   </configuration>
> </execution>
> {code}
> If I change the 'skip' to false, the second part is triggered. But 
> this configuration has existed for a long time; was the cmd line on the build box 
> updated recently? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570885#comment-15570885
 ] 

Hadoop QA commented on HBASE-16807:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
17s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
49s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m 20s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 15s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 131m 49s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hbase.snapshot.TestSnapshotClientRetries |
|   | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
|   | org.apache.hadoop.hbase.client.TestHCM |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833026/HBASE-16807.patch |
| JIRA Issue | HBASE-16807 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 295de781389b 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 92ef234 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3977/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/3977/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3977/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3977/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

[jira] [Commented] (HBASE-16814) FuzzyRowFilter causes remote call timeout

2016-10-12 Thread Hadi Kahraman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570881#comment-15570881
 ] 

Hadi Kahraman commented on HBASE-16814:
---

Well, maybe a bug on both sides. I don't know. Test summary:

PASS: client=1.1.4 server=1.2.0
PASS: client=1.2.0 server=1.2.0
PASS: client=1.2.1 server=1.2.0
FAIL: client=1.2.2 server=1.2.0
FAIL: client=1.2.3 server=1.2.0

> FuzzyRowFilter causes remote call timeout
> -
>
> Key: HBASE-16814
> URL: https://issues.apache.org/jira/browse/HBASE-16814
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.2.2, 1.2.3
> Environment: LinuxMint 17.3 (=Ubuntu 14.04), Java 1.8
>Reporter: Hadi Kahraman
>
> FuzzyRowFilter causes ResultScanner.next to hang and time out. The same code 
> works well on HBase 1.2.1, 1.2.0, and 1.1.4.
> HBase server: Cloudera 5.7.0 (HBase 1.2.0) on 4 hosts, 1 master, 3 workers
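For readers unfamiliar with the filter involved, here is a pure-Java sketch of the matching rule FuzzyRowFilter applies server-side (no HBase dependency; this only illustrates the semantics, it is not the HBase implementation): for each position, a mask byte of 0 means the row byte must equal the pattern byte, and 1 means "any byte".

```java
public class FuzzyMatchSketch {
    // Returns true if `row` matches `pattern` under `mask`
    // (mask[i] == 0: fixed byte; mask[i] == 1: wildcard).
    static boolean fuzzyMatch(byte[] row, byte[] pattern, byte[] mask) {
        if (row.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pattern = {'u', 0, 0, 0, 'a'};  // fixed 'u', any 3 bytes, fixed 'a'
        byte[] mask    = {0, 1, 1, 1, 0};
        System.out.println(fuzzyMatch("u123a".getBytes(), pattern, mask)); // true
        System.out.println(fuzzyMatch("u123b".getBytes(), pattern, mask)); // false
    }
}
```

A hang as reported here would be a client/server disagreement about this matching and seeking logic, not something visible in the semantics themselves.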



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16823) Add examples in HBase Spark

2016-10-12 Thread Weiqing Yang (JIRA)
Weiqing Yang created HBASE-16823:


 Summary: Add examples in HBase Spark
 Key: HBASE-16823
 URL: https://issues.apache.org/jira/browse/HBASE-16823
 Project: HBase
  Issue Type: Improvement
  Components: spark
Reporter: Weiqing Yang
Assignee: Weiqing Yang


This patch adds examples that show how to use Spark DataFrames to read and write 
HBase tables. 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16744) Procedure V2 - Lock procedures to allow clients to acquire locks on tables/namespaces/regions

2016-10-12 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570860#comment-15570860
 ] 

Appy commented on HBASE-16744:
--

Uploaded the latest patch. Let's get it in since HBASE-16786 is also ready.

> Procedure V2 - Lock procedures to allow clients to acquire locks on 
> tables/namespaces/regions
> -
>
> Key: HBASE-16744
> URL: https://issues.apache.org/jira/browse/HBASE-16744
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-16744.master.001.patch, 
> HBASE-16744.master.002.patch, HBASE-16744.master.003.patch, 
> HBASE-16744.master.004.patch, HBASE-16744.master.005.patch, 
> HBASE-16744.master.006.patch
>
>
> Will help us get rid of ZK locks.
> Will be useful for external tools like hbck, future backup manager, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16744) Procedure V2 - Lock procedures to allow clients to acquire locks on tables/namespaces/regions

2016-10-12 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-16744:
-
Attachment: HBASE-16744.master.006.patch

> Procedure V2 - Lock procedures to allow clients to acquire locks on 
> tables/namespaces/regions
> -
>
> Key: HBASE-16744
> URL: https://issues.apache.org/jira/browse/HBASE-16744
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-16744.master.001.patch, 
> HBASE-16744.master.002.patch, HBASE-16744.master.003.patch, 
> HBASE-16744.master.004.patch, HBASE-16744.master.005.patch, 
> HBASE-16744.master.006.patch
>
>
> Will help us get rid of ZK locks.
> Will be useful for external tools like hbck, future backup manager, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570846#comment-15570846
 ] 

Heng Chen commented on HBASE-16807:
---

Got it. +1 for it. 

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. In 
> that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker will always return the old active HM details to the region 
> server on ServiceException.
> Though we create the region server stub on failure, we do so without refreshing 
> the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is NULL; otherwise it will use the same 
> cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until an HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570842#comment-15570842
 ] 

Ted Yu commented on HBASE-16816:


Do the above tests pass with patch locally ?

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Fix For: 1.4.0
>
> Attachments: HBASE-16816-branch-1-v2.patch, 
> HBASE-16816-branch-1-v3.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, in which 
> case it is not movable. So the caller has no way to know whether the move region 
> operation failed.
> It is a problem for "region_move.rb", which only gives up moving a region if an 
> exception is thrown; otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}
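The proposed fix can be sketched as a guard clause on the master side (enum values and method names below are invented for illustration, not the actual HMaster code): besides the existing null check, reject regions that are split or in transition, so callers like region_move.rb fail fast instead of polling until the timeout.

```java
public class MoveCheckSketch {
    // Simplified stand-in for the region state tracked by the master.
    enum State { OPEN, SPLIT, IN_TRANSITION }

    static void checkMovable(String encodedName, State regionState) {
        if (regionState == null) {
            // Existing behaviour: unknown region.
            throw new IllegalArgumentException("Unknown region " + encodedName);
        }
        if (regionState != State.OPEN) {
            // Proposed behaviour: surface the problem instead of returning
            // silently and leaving the caller to wait for a timeout.
            throw new IllegalStateException(
                "Region " + encodedName + " is " + regionState + ", not movable");
        }
    }

    public static void main(String[] args) {
        checkMovable("abc123", State.OPEN);        // passes silently
        try {
            checkMovable("def456", State.SPLIT);   // rejected immediately
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With an exception thrown here, the `rescue` clause in region_move.rb quoted above would catch the failure and move on instead of spinning in the wait loop.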



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570844#comment-15570844
 ] 

Pankaj Kumar commented on HBASE-16807:
--

On ServiceException, instead of using cached data it will refresh the master 
address znode from ZK and create the RS stub.

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. In 
> that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker will always return the old active HM details to the region 
> server on ServiceException.
> Though we create the region server stub on failure, we do so without refreshing 
> the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is NULL; otherwise it will use the same 
> cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until an HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570840#comment-15570840
 ] 

Heng Chen commented on HBASE-16698:
---

The numbers seem to be for 20 regions on one RS. If you have time, please upload 
numbers for one region on one RS; I am very interested in that. As [~stack] said, 
setting it to off by default is good for me.

BTW, the patch LGTM. +1 for it.  

> Performance issue: handlers stuck waiting for CountDownLatch inside 
> WALKey#getWriteEntry under high writing workload
> 
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Affects Versions: 1.1.6, 1.2.3
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16698.branch-1.patch, HBASE-16698.patch, 
> HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
>
> As titled, on our production environment we observed 98 out of 128 handlers 
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside 
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that the problem is mainly caused by 
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under current branch-1 code logic, all batch puts will call 
> {{WALKey#getWriteEntry}} after appending edit to WAL, and 
> {{seqNumAssignedLatch}} is only released when the relative append call is 
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}). 
> Because we currently use a single event handler for the ringbuffer, the 
> append calls are handled one by one (actually lots of our current logic 
> depends on this sequential handling), and this becomes a bottleneck 
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on 
> all regions are handled sequentially, which causes contention among 
> different regions...
> To fix this, we could make use of the "sequential appends" mechanism: grab the 
> WriteEntry before publishing the append onto the ringbuffer and use it as the 
> sequence id; we only need to add a lock to make "grab WriteEntry" and "append 
> edit" a transaction. This will still cause contention inside a region but avoids 
> contention between different regions. This solution has already been verified in 
> our online environment and proved to be effective.
> Notice that for master (2.0) branch since we already change the write 
> pipeline to sync before writing memstore (HBASE-15158), this issue only 
> exists for the ASYNC_WAL writes scenario.
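The "grab the WriteEntry before publishing" idea can be reduced to a few lines (all names below are invented stand-ins, not the WAL code itself): assign the sequence number inside a short region-local critical section that also publishes to the ring buffer, so the handler knows its sequence id immediately and never blocks on a latch waiting for the single ringbuffer thread.

```java
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

public class SeqIdSketch {
    private final AtomicLong seq = new AtomicLong();
    private final Object appendLock = new Object(); // per-region, not global

    // Returns the sequence id synchronously; "assign id" + "publish" form
    // one atomic step, so ids come out in publication order.
    long append(String edit, Queue<String> ringBuffer) {
        synchronized (appendLock) {
            long id = seq.incrementAndGet();   // seq id known immediately
            ringBuffer.add(id + ":" + edit);   // publish with id attached
            return id;                         // no CountDownLatch to wait on
        }
    }
}
```

Handlers for different regions contend only on their own region's `appendLock`, which is the cross-region improvement the description claims.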



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16814) FuzzyRowFilter causes remote call timeout

2016-10-12 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570833#comment-15570833
 ] 

ramkrishna.s.vasudevan commented on HBASE-16814:


Oh, you mean this is a client-side bug? But the scan is throwing a timeout 
exception. 

> FuzzyRowFilter causes remote call timeout
> -
>
> Key: HBASE-16814
> URL: https://issues.apache.org/jira/browse/HBASE-16814
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.2.2, 1.2.3
> Environment: LinuxMint 17.3 (=Ubuntu 14.04), Java 1.8
>Reporter: Hadi Kahraman
>
> FuzzyRowFilter causes ResultScanner.next hang and timeout. The same code 
> works well on hbase 1.2.1, 1.2.0, 1.1.4.
> hbase server: cloudera 5.7.0 (hbase 1.2.0) on 4 hosts, 1 master, 3 workers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compaction

2016-10-12 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570831#comment-15570831
 ] 

ramkrishna.s.vasudevan commented on HBASE-16578:


+1.

> Mob data loss after mob compaction and normal compaction
> 
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: Jingcheng Du
> Attachments: HBASE-16578-V2.patch, HBASE-16578.patch, 
> TestMobCompaction.java, TestMobCompaction.java
>
>
> StoreFileScanners on MOB cells rely on the scannerOrder to find the latest 
> cells after mob compaction. The value of scannerOrder is assigned by the 
> order of maxSeqId of StoreFile, and this maxSeqId is valued only after the 
> reader of the StoreFile is created.
> In {{Compactor.compact}}, the compacted store files are cloned and their 
> readers are not created. And in {{StoreFileScanner.getScannersForStoreFiles}} 
> the StoreFiles are sorted before the readers are created and at that time the 
> maxSeqId for each file is -1 (the default value). This will lead to chaos 
> among the scanners in the following normal compaction. Some older cells might be 
> chosen during the normal compaction.
> We need to create readers either before the sorting in the method 
> {{StoreFileScanner.getScannersForStoreFiles}}, or create readers just after 
> the store files are cloned in {{Compactor.compact}}.
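A toy illustration of the ordering bug described above (the types below are invented stand-ins, not the StoreFile code): while the reader is unopened, every file reports the default maxSeqId of -1, so a sort keyed on it cannot distinguish the files; opening the readers first makes the sort meaningful.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReaderOrderSketch {
    static final class StoreFileStub {
        final long realSeqId;
        long maxSeqId = -1;                 // unknown until the reader is created
        StoreFileStub(long realSeqId) { this.realSeqId = realSeqId; }
        void createReader() { maxSeqId = realSeqId; } // reader reads the trailer
    }

    static void sortBySeqId(List<StoreFileStub> files) {
        files.sort(Comparator.comparingLong(f -> f.maxSeqId));
    }

    public static void main(String[] args) {
        List<StoreFileStub> files = new ArrayList<>(
            List.of(new StoreFileStub(9), new StoreFileStub(3)));
        sortBySeqId(files);   // every key is -1: the sort decides nothing
        files.forEach(StoreFileStub::createReader); // fix: open readers first...
        sortBySeqId(files);   // ...then the order reflects the real seq ids
        System.out.println(files.get(0).realSeqId); // oldest file first
    }
}
```

This is exactly the trap in `StoreFileScanner.getScannersForStoreFiles` with cloned files: the sort happens before `createReader`, so the scannerOrder derived from it is arbitrary.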



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570829#comment-15570829
 ] 

Heng Chen commented on HBASE-16807:
---

Your patch seems to just skip the cache without considering the ServiceException, right? 

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. In 
> that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker will always return the old active HM details to the region 
> server on ServiceException.
> Though we create the region server stub on failure, we do so without refreshing 
> the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is NULL; otherwise it will use the same 
> cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until an HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16752) Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570828#comment-15570828
 ] 

Hadoop QA commented on HBASE-16752:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
6s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
42s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
28s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 43s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 152m 35s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 196m 35s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
|   | org.apache.hadoop.hbase.client.TestMetaWithReplicas |
|   | org.apache.hadoop.hbase.client.TestEnableTable |
|   | org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833011/HBASE-16752.V2.patch |
| JIRA Issue | HBASE-16752 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 2a64ca92b9dd 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 92ef234 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBAS

[jira] [Commented] (HBASE-16653) Backport HBASE-11393 to all branches which support namespace

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570822#comment-15570822
 ] 

Heng Chen commented on HBASE-16653:
---

Let me take a look at patch v3.  It will need some time.

> Backport HBASE-11393 to all branches which support namespace
> 
>
> Key: HBASE-16653
> URL: https://issues.apache.org/jira/browse/HBASE-16653
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.0.5, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 1.4.0
>
> Attachments: HBASE-16653-branch-1-v1.patch, 
> HBASE-16653-branch-1-v2.patch, HBASE-16653-branch-1-v3.patch
>
>
> As HBASE-11386 mentioned, the parsing code for the replication table-cfs 
> config goes wrong when a table name contains a namespace, so we can only 
> configure the default namespace's tables in the peer. It is a bug for all 
> branches which support namespace. HBASE-11393 resolved this by using a pb 
> object, but it was only merged to the master branch. Other branches still 
> have this problem. I think we should fix this bug in all branches which 
> support namespace.
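The ambiguity can be illustrated with a toy parser (a simplification for illustration only, not the actual pre-HBASE-11393 code; the class and method names are hypothetical): splitting a table-cfs entry on the first ':' treats everything before it as the table name, so a namespace-qualified table is misread.

```java
class TableCfsEntry {
    final String table;
    final String cfs;

    TableCfsEntry(String table, String cfs) {
        this.table = table;
        this.cfs = cfs;
    }

    // Toy version of string-based table-cfs parsing: the text before the
    // first ':' is taken as the table name, the rest as the column families.
    // A namespace-qualified table like "ns1:tableA" is therefore misparsed
    // as table "ns1" with column family "tableA".
    static TableCfsEntry naiveParse(String entry) {
        int i = entry.indexOf(':');
        return i < 0
            ? new TableCfsEntry(entry, "")
            : new TableCfsEntry(entry.substring(0, i), entry.substring(i + 1));
    }
}
```

A structured representation (such as the pb object HBASE-11393 introduced) sidesteps the delimiter ambiguity entirely.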



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16792) Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState

2016-10-12 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-16792:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for the patch [~aoxiang]. Thanks for the review 
[~saint@gmail.com].

> Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState
> --
>
> Key: HBASE-16792
> URL: https://issues.apache.org/jira/browse/HBASE-16792
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: binlijin
>Assignee: binlijin
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16792_master.patch
>
>
> Every SeekerState#invalidate currently creates a fresh 
> KeyValue.KeyOnlyKeyValue; we should reuse a single instance instead.
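The reuse pattern can be sketched with plain stand-in classes (KeyHolder and SeekerStateSketch are hypothetical names for illustration, not HBase's actual classes):

```java
// Hypothetical stand-in for KeyValue.KeyOnlyKeyValue: a mutable holder that
// can be re-pointed at a new backing buffer instead of being reallocated.
class KeyHolder {
    byte[] buf;
    int offset;
    int length;

    void set(byte[] buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }
}

class SeekerStateSketch {
    // Allocated once for the lifetime of the seeker and reused, rather than
    // creating a fresh holder on every invalidate().
    final KeyHolder currentKey = new KeyHolder();

    void invalidate() {
        // Reset the existing holder; no "new KeyHolder()" per call.
        currentKey.set(null, 0, 0);
    }

    void position(byte[] key) {
        currentKey.set(key, 0, key.length);
    }
}
```

On a hot decode path this avoids one short-lived allocation per invalidate, which is the garbage the patch is trimming.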



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16792) Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState

2016-10-12 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570816#comment-15570816
 ] 

ramkrishna.s.vasudevan commented on HBASE-16792:


+1. LGTM. Shall commit this code to master.

> Reuse KeyValue.KeyOnlyKeyValue in BufferedDataBlockEncoder.SeekerState
> --
>
> Key: HBASE-16792
> URL: https://issues.apache.org/jira/browse/HBASE-16792
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: binlijin
>Assignee: binlijin
>Priority: Minor
> Attachments: HBASE-16792_master.patch
>
>
> Every SeekerState#invalidate currently creates a fresh 
> KeyValue.KeyOnlyKeyValue; we should reuse a single instance instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16810) HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are in /hbase/draining znode and unloaded

2016-10-12 Thread David Pope (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Pope updated HBASE-16810:
---
Attachment: HBASE-16810.branch-1.v2.patch

Added MockNoopMasterServices.

> HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are 
> in /hbase/draining znode and unloaded
> -
>
> Key: HBASE-16810
> URL: https://issues.apache.org/jira/browse/HBASE-16810
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: David Pope
> Fix For: 1.3.0
>
> Attachments: HBASE-16810.branch-1.patch, 
> HBASE-16810.branch-1.v2.patch, HBASE-16810.patch, master.patch
>
>
> 1. Add a regionserver znode under /hbase/draining znode.
> 2. Use RegionMover to unload all regions from the regionserver.
> 3. Run balancer.
> {code}
> 16/09/21 14:17:33 ERROR ipc.RpcServer: Unexpected throwable object
> java.lang.ArrayIndexOutOfBoundsException: 75
>   at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:867)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1186)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:521)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:309)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:264)
>   at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1339)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:442)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58555)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2268)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> {code}
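The stack trace above points at an index computed against one server list being used against a differently sized structure once draining servers are excluded. As an illustration only (not HBase's actual balancer code; all names here are hypothetical), remapping region-to-server indices after filtering keeps lookups in bounds:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DrainingIndexRemap {
    // regionToServer holds indices into allServers; return indices into the
    // filtered (non-draining) server list, or -1 for regions whose server was
    // drained, so arrays sized to the filtered list are never overrun.
    static int[] remap(List<String> allServers, Set<String> draining,
                       int[] regionToServer) {
        Map<String, Integer> activeIndex = new HashMap<>();
        List<String> active = new ArrayList<>();
        for (String s : allServers) {
            if (!draining.contains(s)) {
                activeIndex.put(s, active.size());
                active.add(s);
            }
        }
        int[] out = new int[regionToServer.length];
        for (int i = 0; i < regionToServer.length; i++) {
            out[i] = activeIndex.getOrDefault(
                allServers.get(regionToServer[i]), -1);
        }
        return out;
    }
}
```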



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16810) HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are in /hbase/draining znode and unloaded

2016-10-12 Thread David Pope (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Pope updated HBASE-16810:
---
Attachment: (was: HBASE-16810.branch-1.v2.patch)

> HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are 
> in /hbase/draining znode and unloaded
> -
>
> Key: HBASE-16810
> URL: https://issues.apache.org/jira/browse/HBASE-16810
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: David Pope
> Fix For: 1.3.0
>
> Attachments: HBASE-16810.branch-1.patch, HBASE-16810.patch, 
> master.patch
>
>
> 1. Add a regionserver znode under /hbase/draining znode.
> 2. Use RegionMover to unload all regions from the regionserver.
> 3. Run balancer.
> {code}
> 16/09/21 14:17:33 ERROR ipc.RpcServer: Unexpected throwable object
> java.lang.ArrayIndexOutOfBoundsException: 75
>   at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:867)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1186)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:521)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:309)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:264)
>   at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1339)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:442)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58555)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2268)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16810) HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are in /hbase/draining znode and unloaded

2016-10-12 Thread David Pope (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Pope updated HBASE-16810:
---
Attachment: HBASE-16810.branch-1.v2.patch

Added MockNoopMasterServices.

> HBase Balancer throws ArrayIndexOutOfBoundsException when regionservers are 
> in /hbase/draining znode and unloaded
> -
>
> Key: HBASE-16810
> URL: https://issues.apache.org/jira/browse/HBASE-16810
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: David Pope
> Fix For: 1.3.0
>
> Attachments: HBASE-16810.branch-1.patch, 
> HBASE-16810.branch-1.v2.patch, HBASE-16810.patch, master.patch
>
>
> 1. Add a regionserver znode under /hbase/draining znode.
> 2. Use RegionMover to unload all regions from the regionserver.
> 3. Run balancer.
> {code}
> 16/09/21 14:17:33 ERROR ipc.RpcServer: Unexpected throwable object
> java.lang.ArrayIndexOutOfBoundsException: 75
>   at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:867)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1186)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:521)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:309)
>   at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:264)
>   at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1339)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:442)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58555)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2268)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15638) Shade protobuf

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570791#comment-15570791
 ] 

Hudson commented on HBASE-15638:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1776 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1776/])
HBASE-16808 Included the generated, shaded java files from HBASE-15638 (stack: 
rev c6289f197685012eeb63b880f151499a47485e2d)
* (edit) hbase-assembly/src/main/assembly/src.xml


> Shade protobuf
> --
>
> Key: HBASE-15638
> URL: https://issues.apache.org/jira/browse/HBASE-15638
> Project: HBase
>  Issue Type: Bug
>  Components: Protobufs
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: 15638v2.patch, HBASE-15638.master.001.patch, 
> HBASE-15638.master.002.patch, HBASE-15638.master.003 (1).patch, 
> HBASE-15638.master.003 (1).patch, HBASE-15638.master.003 (1).patch, 
> HBASE-15638.master.003.patch, HBASE-15638.master.003.patch, 
> HBASE-15638.master.004.patch, HBASE-15638.master.005.patch, 
> HBASE-15638.master.006.patch, HBASE-15638.master.007.patch, 
> HBASE-15638.master.007.patch, HBASE-15638.master.008.patch, 
> HBASE-15638.master.009.patch, as.far.as.server.patch
>
>
> We need to change our protobuf. Currently it is pb2.5.0. As is, protobufs 
> expect all buffers to be on-heap byte arrays. It does not have facility for 
> dealing in ByteBuffers and off-heap ByteBuffers in particular. This fact 
> frustrates the off-heaping-of-the-write-path project as 
> marshalling/unmarshalling of protobufs involves a copy on-heap first.
> So, we need to patch our protobuf so it supports off-heap ByteBuffers. To 
> ensure we pick up the patched protobuf always, we need to relocate/shade our 
> protobuf and adjust all protobuf references accordingly.
> Given as we have protobufs in our public facing API, Coprocessor Endpoints -- 
> which use protobuf Service to describe new API -- a blind relocation/shading 
> of com.google.protobuf.* will break our API for CoProcessor EndPoints (CPEP) 
> in particular. For example, in the Table Interface, to invoke a method on a 
> registered CPEP, we have:
> {code}
> <T extends com.google.protobuf.Service, R> Map<byte[], R> coprocessorService(
>     Class<T> service, byte[] startKey, byte[] endKey,
>     org.apache.hadoop.hbase.client.coprocessor.Batch.Call<T, R> callable)
>   throws com.google.protobuf.ServiceException, Throwable{code}
> This issue is how we intend to shade protobuf for hbase-2.0.0 while 
> preserving our API as is so CPEPs continue to work on the new hbase.
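For a sense of what such shading involves, a maven-shade-plugin relocation looks roughly like the fragment below; the shaded package name shown is an assumption for illustration, not a quote from the actual HBase build.

```xml
<!-- Sketch of a maven-shade-plugin relocation for protobuf classes.
     The shadedPattern package below is illustrative. -->
<relocations>
  <relocation>
    <pattern>com.google.protobuf</pattern>
    <shadedPattern>org.apache.hadoop.hbase.shaded.com.google.protobuf</shadedPattern>
  </relocation>
</relocations>
```

The relocation rewrites bytecode references, which is exactly why a blind application of it would also rewrite the `com.google.protobuf` types exposed in the CPEP-facing API above.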



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16813) Procedure v2 - Move ProcedureEvent to hbase-procedure module

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570792#comment-15570792
 ] 

Hudson commented on HBASE-16813:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1776 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1776/])
HBASE-16813 Procedure v2 - Move ProcedureEvent to hbase-procedure module 
(matteo.bertozzi: rev 92ef234486537b4325641ce47f6fde26d9432710)
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/AbstractProcedureScheduler.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureEvents.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureScheduler.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestYieldProcedures.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java
* (delete) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureSimpleRunQueue.java
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureEvent.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureSuspended.java
* (add) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureSchedulerConcurrency.java
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureEventQueue.java
* (delete) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureRunnableSet.java
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/SimpleProcedureScheduler.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureSchedulerConcurrency.java
* (add) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureEvents.java
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureScheduler.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureEnv.java
* (edit) 
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/ProcedureTestingUtility.java


> Procedure v2 - Move ProcedureEvent to hbase-procedure module
> 
>
> Key: HBASE-16813
> URL: https://issues.apache.org/jira/browse/HBASE-16813
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-16813-v0.patch
>
>
> ProcedureEvent was added in MasterProcedureScheduler, but it is generic 
> enough to move to hbase-procedure module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16808) Included the generated, shaded java files from "HBASE-15638 Shade protobuf" in our src assembly

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570790#comment-15570790
 ] 

Hudson commented on HBASE-16808:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1776 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1776/])
HBASE-16808 Included the generated, shaded java files from HBASE-15638 (stack: 
rev c6289f197685012eeb63b880f151499a47485e2d)
* (edit) hbase-assembly/src/main/assembly/src.xml


> Included the generated, shaded java files from "HBASE-15638 Shade protobuf" 
> in our src assembly
> ---
>
> Key: HBASE-16808
> URL: https://issues.apache.org/jira/browse/HBASE-16808
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-16808.master.001.patch
>
>
> [~Apache9] found that I forgot to include the generated, shaded files in our 
> src tgz. Let me fix. This is a follow-on to HBASE-16793



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16746) The log of “region close” should differ from “region move”

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570793#comment-15570793
 ] 

Hudson commented on HBASE-16746:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1776 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1776/])
HBASE-16746 The log of “region close” should differ from “region move” (stack: 
rev dfb2a800c43b30c326dbdcec673387f7033f2092)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java


> The log of “region close” should differ from “region move”
> --
>
> Key: HBASE-16746
> URL: https://issues.apache.org/jira/browse/HBASE-16746
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16746.v0.patch
>
>
> If we disable some tables, we will see the following in region server log.
> {noformat}
> Close 90ed2fe1748644c6faecdec3651335d4, moving to null
> {noformat}
> The message “moving to null” is a bit confusing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570779#comment-15570779
 ] 

Hadoop QA commented on HBASE-16816:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
4s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
57s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 6s 
{color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 53s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 47s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 118m 27s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
|   | org.apache.hadoop.hbase.client.TestMetaWithReplicas |
|   | org.apache.hadoop.hbase.TestHColumnDescriptorDefaultVersions |
|   | org.apache.hadoop.hbase.regionserver.TestClusterId |
|   | org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:b2c5d84 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833020/HBASE-16816-branch-1-v3.patch
 |
| JIRA Issue | HBASE-16816 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbase

[jira] [Commented] (HBASE-16489) Configuration parsing

2016-10-12 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570768#comment-15570768
 ] 

Xiaobing Zhou commented on HBASE-16489:
---

[~sudeeps] thanks for the work. HDFS-9537 and HDFS-9538 have done the work of 
loading configuration; you might want to refer to them and reuse the work if 
possible. HDFS-9632, HDFS-9791, HDFS-10787 and HDFS-10611 are the follow-up 
work and fixes.

> Configuration parsing
> -
>
> Key: HBASE-16489
> URL: https://issues.apache.org/jira/browse/HBASE-16489
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-16489.HBASE-14850.v1.patch
>
>
> Reading hbase-site.xml is required to read various properties, viz. 
> zookeeper-quorum, client retries, etc.  We can either use Apache Xerces or 
> Boost libraries.
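As a rough sketch of what such parsing involves (shown in Java for brevity, though the native client work itself targets C++; the class name here is illustrative), the file is a flat list of name/value properties:

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SiteConfParser {
    // Read a Hadoop-style config file, i.e.
    // <configuration><property><name/><value/></property>...</configuration>,
    // into a flat name -> value map.
    static Map<String, String> parse(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(in);
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            props.put(
                p.getElementsByTagName("name").item(0).getTextContent().trim(),
                p.getElementsByTagName("value").item(0).getTextContent().trim());
        }
        return props;
    }
}
```

The native client would do the equivalent with Xerces-C or Boost.PropertyTree to pick up keys such as `hbase.zookeeper.quorum`.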



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16822) Enable restore-snapshot and clone-snapshot to use an externally specified snapshot location

2016-10-12 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570758#comment-15570758
 ] 

Jerry He commented on HBASE-16822:
--

In the Backup/Restore feature under discussion, this ticket can help too by 
providing an alternative to bulkloading during restore.  Hopefully we can 
restore/clone the full backup, which is an exported snapshot.

> Enable restore-snapshot and clone-snapshot to use an externally specified 
> snapshot location 
> 
>
> Key: HBASE-16822
> URL: https://issues.apache.org/jira/browse/HBASE-16822
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jerry He
>
> Currently restore-snapshot and clone-snapshot only work with the snapshots 
> that are under hbase root.dir.
> In combination with export-snapshot, this means the snapshot needs to be 
> exported out to another hbase root.dir, or back and forth eventually to a 
> hbase root.dir.
> There are a few issues with the approach.
> We've known that export-snapshot has a limitation when dealing with secure 
> clusters, where the external user needs read access to hbase root.dir data, 
> by-passing the table ACL check.
> The second problem arises when we try to use or bring back the exported 
> snapshot for restore/clone.  The snapshots have to be in the target hbase 
> root.dir, and the user needs write permission to get them in there.
> Again we will have a permission problem.
> This ticket tries to deal with the second problem: clone and restore from 
> exported snapshots.  The exported snapshots can be on the same cluster, but 
> the user may not have write permission to move them to hbase root.dir.
> We should have a solution that allows clone/restore of a snapshot from an 
> external path that keeps snapshot backups, and does so with security 
> permissions in mind.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16664) Timeout logic in AsyncProcess is broken

2016-10-12 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570746#comment-15570746
 ] 

Heng Chen commented on HBASE-16664:
---

Failed test cases for v7 seem unrelated.  If you are not in a hurry, I will 
commit it later today.  Otherwise, you could ask [~Apache9] to commit it.  
Thanks for your work, [~yangzhe1991]!  :)  

> Timeout logic in AsyncProcess is broken
> ---
>
> Key: HBASE-16664
> URL: https://issues.apache.org/jira/browse/HBASE-16664
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: 1.patch, HBASE-16664-branch-1-v1.patch, 
> HBASE-16664-branch-1-v1.patch, HBASE-16664-branch-1-v2.patch, 
> HBASE-16664-branch-1.1-v1.patch, HBASE-16664-branch-1.2-v1.patch, 
> HBASE-16664-branch-1.3-v1.patch, HBASE-16664-branch-1.3-v2.patch, 
> HBASE-16664-v1.patch, HBASE-16664-v2.patch, HBASE-16664-v3.patch, 
> HBASE-16664-v4.patch, HBASE-16664-v5.patch, HBASE-16664-v6.patch, 
> HBASE-16664-v7.patch, testhcm.patch
>
>
> Rpc/operation timeout logic in AsyncProcess is broken. And Table's 
> set*Timeout does not take effect in its AP or BufferedMutator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16822) Enable restore-snapshot and clone-snapshot to use an externally specified snapshot location

2016-10-12 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-16822:
-
Description: 
Currently restore-snapshot and clone-snapshot only work with the snapshots that 
are under hbase root.dir.

In combination with export-snapshot, this means the snapshot needs to be 
exported out to another hbase root.dir, or back and forth eventually to a hbase 
root.dir.

There are a few issues with the approach.

We've known that export-snapshot has a limitation when dealing with secure 
clusters, where the external user needs read access to hbase root.dir data, 
by-passing the table ACL check.

The second problem arises when we try to use or bring back the exported 
snapshot for restore/clone.  The snapshots have to be in the target hbase 
root.dir, and the user needs write permission to get them in there.

Again we will have a permission problem.

This ticket tries to deal with the second problem: clone and restore from 
exported snapshots.  The exported snapshots can be on the same cluster, but 
the user may not have write permission to move them to hbase root.dir.

We should have a solution that allows clone/restore of a snapshot from an 
external path that keeps snapshot backups, and does so with security 
permissions in mind.



  was:
Currently restore-snapshot and clone-snapshot only work with the snapshots that 
are under hbase root.dir.

In combination with export-snapshot, this means the snapshot needs to be 
exported out to another hbase root.dir, or back and forth eventually to a hbase 
root.dir.

There are a few issues with the approach.

We've know that export-snapshot has a limitation dealing with secure cluster, 
where the external user needs to have read access to hbase root.dir data, 
by-passing table ACL check.

The second problem is when we try to use or bring back the exported snapshot 
for restore/clone.  They have to in the target hbase root.dir, and needs write 
permission to get it in there.

Again we will have permission problem.

This ticket tries to deal with the second problem, clone and restore from a 
exported snapshot.  The exported snapshots can be on the same cluster but the 
user may not have write permission to move it to hbase.root.dir.

We should have a solution that allow clone/restore snapshot from a external 
path that keeps snapshot backups. And also do it with security permission in 
mind.




> Enable restore-snapshot and clone-snapshot to use an externally specified 
> snapshot location 
> 
>
> Key: HBASE-16822
> URL: https://issues.apache.org/jira/browse/HBASE-16822
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jerry He
>
> Currently restore-snapshot and clone-snapshot only work with the snapshots 
> that are under hbase root.dir.
> In combination with export-snapshot, this means the snapshot needs to be 
> exported out to another hbase root.dir, or back and forth eventually to a 
> hbase root.dir.
> There are a few issues with the approach.
> We've known that export-snapshot has a limitation when dealing with secure 
> clusters, where the external user needs read access to hbase root.dir data, 
> by-passing the table ACL check.
> The second problem arises when we try to use or bring back the exported 
> snapshot for restore/clone.  The snapshots have to be in the target hbase 
> root.dir, and the user needs write permission to get them in there.
> Again we will have a permission problem.
> This ticket tries to deal with the second problem: clone and restore from 
> exported snapshots.  The exported snapshots can be on the same cluster, but 
> the user may not have write permission to move them to hbase root.dir.
> We should have a solution that allows clone/restore of a snapshot from an 
> external path that keeps snapshot backups, and does so with security 
> permissions in mind.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compaction

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570718#comment-15570718
 ] 

Hadoop QA commented on HBASE-16578:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
8s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
45s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
38s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 33s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 45s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
13s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 115m 21s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.constraint.TestConstraint |
|   | org.apache.hadoop.hbase.filter.TestFuzzyRowAndColumnRangeFilter |
|   | org.apache.hadoop.hbase.TestNamespace |
|   | 
org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833016/HBASE-16578-V2.patch |
| JIRA Issue | HBASE-16578 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 9a34a7b31a73 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 92ef234 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3973/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/3973/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3973/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3973/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


[jira] [Commented] (HBASE-16505) Save deadline in RpcCallContext according to request's timeout

2016-10-12 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570712#comment-15570712
 ] 

Phil Yang commented on HBASE-16505:
---

Thanks all for reviewing, thanks [~stack] for committing.

> Save deadline in RpcCallContext according to request's timeout
> --
>
> Key: HBASE-16505
> URL: https://issues.apache.org/jira/browse/HBASE-16505
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16505-branch-1-v1.patch, 
> HBASE-16505-branch-1-v2.patch, HBASE-16505-branch-1-v3.patch, 
> HBASE-16505-v1.patch, HBASE-16505-v10.patch, HBASE-16505-v10.patch, 
> HBASE-16505-v11.patch, HBASE-16505-v11.patch, HBASE-16505-v12.patch, 
> HBASE-16505-v13.patch, HBASE-16505-v2.patch, HBASE-16505-v3.patch, 
> HBASE-16505-v4.patch, HBASE-16505-v5.patch, HBASE-16505-v6.patch, 
> HBASE-16505-v7.patch, HBASE-16505-v8.patch, HBASE-16505-v9.patch
>
>
> If we want to know the correct setting of timeout in read/write path, we need 
> add a new parameter in operation-methods of Region.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16822) Enable restore-snapshot and clone-snapshot to use externally specified snapshot location

2016-10-12 Thread Jerry He (JIRA)
Jerry He created HBASE-16822:


 Summary: Enable restore-snapshot and clone-snapshot to use 
externally specified snapshot location 
 Key: HBASE-16822
 URL: https://issues.apache.org/jira/browse/HBASE-16822
 Project: HBase
  Issue Type: Improvement
Reporter: Jerry He


Currently restore-snapshot and clone-snapshot only work with the snapshots that 
are under hbase root.dir.

In combination with export-snapshot, this means the snapshot needs to be 
exported out to another hbase root.dir, or back and forth eventually to a hbase 
root.dir.

There are a few issues with the approach.

We've known that export-snapshot has a limitation dealing with secure clusters, 
where the external user needs to have read access to hbase root.dir data, 
by-passing the table ACL check.

The second problem is when we try to use or bring back the exported snapshot 
for restore/clone.  They have to be in the target hbase root.dir, and write 
permission is needed to get them in there.

Again we will have a permission problem.

This ticket tries to deal with the second problem: clone and restore from an 
exported snapshot.  The exported snapshots can be on the same cluster, but the 
user may not have write permission to move them to hbase.root.dir.

We should have a solution that allows clone/restore of a snapshot from an 
external path that keeps snapshot backups, and does it with security 
permissions in mind.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Pankaj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-16807:
-
Description: 
It's little weird, but it happened in the product environment that few 
RegionServer missed master znode create notification on master failover. In 
that case ZooKeeperNodeTracker will not refresh the cached data and 
MasterAddressTracker will always return old active HM detail to Region server 
on ServiceException.

Though We create region server stub on failure but without refreshing the 
MasterAddressTracker data.

In HRegionServer.createRegionServerStatusStub()
{code}
  boolean refresh = false; // for the first time, use cached data
RegionServerStatusService.BlockingInterface intf = null;
boolean interrupted = false;
try {
  while (keepLooping()) {
sn = this.masterAddressTracker.getMasterAddress(refresh);
if (sn == null) {
  if (!keepLooping()) {
// give up with no connection.
LOG.debug("No master found and cluster is stopped; bailing out");
return null;
  }
  if (System.currentTimeMillis() > (previousLogTime + 1000)) {
LOG.debug("No master found; retry");
previousLogTime = System.currentTimeMillis();
  }
  refresh = true; // let's try pull it from ZK directly
  if (sleep(200)) {
interrupted = true;
  }
  continue;
}
{code}

Here we refresh the node only when 'sn' is null; otherwise it will use the same 
cached data. 

So in the above case the RegionServer will never report to the active HMaster 
successfully until HMaster failover or RegionServer restart.
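
The caching behavior described in the description can be simulated with a 
minimal, hypothetical sketch (class and field names here are illustrative, not 
the actual HBase MasterAddressTracker API):

```java
// Simplified simulation of the stale-cache behavior: a tracker that only
// re-reads ZooKeeper when explicitly asked to refresh.
public class StaleMasterDemo {
    // Stand-in for MasterAddressTracker (hypothetical, not HBase code).
    static class CachedTracker {
        private String cached = "old-master:16000";
        private final String zk = "new-master:16000"; // what ZK currently holds

        String getMasterAddress(boolean refresh) {
            if (refresh) {
                cached = zk; // re-read from ZooKeeper
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        CachedTracker tracker = new CachedTracker();
        // Buggy pattern: refresh stays false while the cached value is non-null,
        // so a stale address is returned forever.
        System.out.println("without refresh: " + tracker.getMasterAddress(false));
        // Proposed fix: force a refresh after a failed report (ServiceException),
        // which picks up the new active master.
        System.out.println("with refresh:    " + tracker.getMasterAddress(true));
    }
}
```

The point of the sketch: as long as the caller never passes refresh=true, the 
znode-change miss is never repaired, matching the report above.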

  was:
It's a little weird, but it happened in a production environment that a few 
RegionServers missed the master znode create notification on master failover. 
In that case ZooKeeperNodeTracker will not refresh the cached data, and 
MasterAddressTracker will always return the old active HM detail to the region 
server on ServiceException.

Though we re-create the region server stub on failure, we do so without 
refreshing the MasterAddressTracker data.

In HRegionServer.createRegionServerStatusStub()
{code}
  boolean refresh = false; // for the first time, use cached data
RegionServerStatusService.BlockingInterface intf = null;
boolean interrupted = false;
try {
  while (keepLooping()) {
sn = this.masterAddressTracker.getMasterAddress(refresh);
if (sn == null) {
  if (!keepLooping()) {
// give up with no connection.
LOG.debug("No master found and cluster is stopped; bailing out");
return null;
  }
  if (System.currentTimeMillis() > (previousLogTime + 1000)) {
LOG.debug("No master found; retry");
previousLogTime = System.currentTimeMillis();
  }
  refresh = true; // let's try pull it from ZK directly
  if (sleep(200)) {
interrupted = true;
  }
  continue;
}
{code}

Here we refresh the node only when 'sn' is null; otherwise it will use the same 
cached data. 

So in the above case the RegionServer will never report to the active HMaster 
successfully until HMaster failover or RegionServer restart.


> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. 
> In that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker will always return the old active HM detail to the 
> region server on ServiceException.
> Though we re-create the region server stub on failure, we do so without 
> refreshing the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTi

[jira] [Commented] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570663#comment-15570663
 ] 

Ted Yu commented on HBASE-16807:


lgtm

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. 
> In that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker 
> will always return the old active HM detail to the region server on 
> ServiceException.
> Though we re-create the region server stub on failure, we do so without 
> refreshing the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is null; otherwise it will use the 
> same cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16716) OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains forever

2016-10-12 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570658#comment-15570658
 ] 

Pankaj Kumar commented on HBASE-16716:
--

We should commit this in other branches (1.1/1.2/1.3) as well, IMHO.


> OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains 
> forever
> -
>
> Key: HBASE-16716
> URL: https://issues.apache.org/jira/browse/HBASE-16716
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 2.0.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16716-V2.patch, HBASE-16716-branch-1.patch, 
> HBASE-16716.patch
>
>
> OfflineMetaRepair rebuilds the Meta table; while creating the meta region it 
> creates its own WAL (inside /hbase/WALs/hbck-meta-recovery-) which will 
> be closed and archived after rebuilding Meta. 
> {noformat}
> hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
> >> /hbase/WALs/hbck-meta-recovery-
> {noformat}
> It doesn't clear the empty dir; the empty directory should be removed after 
> success.
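
The cleanup the issue asks for amounts to removing the recovery WAL directory 
once it is empty. A minimal sketch of that pattern, using java.nio for a 
self-contained illustration (the real patch operates on the Hadoop FileSystem 
API; the helper name and demo path are hypothetical):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmptyDirCleanup {
    // Delete the directory only if it exists and contains no entries, mirroring
    // the "remove the empty hbck-meta-recovery-<servername> dir on success" fix.
    static boolean deleteIfEmpty(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            if (entries.iterator().hasNext()) {
                return false; // not empty; leave it alone
            }
        }
        Files.delete(dir);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Demo: a freshly created temp dir is empty, so it gets removed.
        Path dir = Files.createTempDirectory("hbck-meta-recovery-demo");
        System.out.println(deleteIfEmpty(dir));
    }
}
```

Guarding on emptiness keeps the cleanup safe to run even if WAL archiving left 
something behind unexpectedly.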



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Pankaj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-16807:
-
Fix Version/s: 2.0.0
   Status: Patch Available  (was: Open)

Simple patch, please review.

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. 
> In that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker 
> will always return the old active HM detail to the region server on 
> ServiceException.
> Though we re-create the region server stub on failure, we do so without 
> refreshing the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is null; otherwise it will use the 
> same cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16807) RegionServer will fail to report new active Hmaster until HMaster/RegionServer failover

2016-10-12 Thread Pankaj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-16807:
-
Attachment: HBASE-16807.patch

> RegionServer will fail to report new active Hmaster until 
> HMaster/RegionServer failover
> ---
>
> Key: HBASE-16807
> URL: https://issues.apache.org/jira/browse/HBASE-16807
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
> Attachments: HBASE-16807.patch
>
>
> It's a little weird, but it happened in a production environment that a few 
> RegionServers missed the master znode create notification on master failover. 
> In that case ZooKeeperNodeTracker will not refresh the cached data, and 
> MasterAddressTracker 
> will always return the old active HM detail to the region server on 
> ServiceException.
> Though we re-create the region server stub on failure, we do so without 
> refreshing the MasterAddressTracker data.
> In HRegionServer.createRegionServerStatusStub()
> {code}
>   boolean refresh = false; // for the first time, use cached data
> RegionServerStatusService.BlockingInterface intf = null;
> boolean interrupted = false;
> try {
>   while (keepLooping()) {
> sn = this.masterAddressTracker.getMasterAddress(refresh);
> if (sn == null) {
>   if (!keepLooping()) {
> // give up with no connection.
> LOG.debug("No master found and cluster is stopped; bailing out");
> return null;
>   }
>   if (System.currentTimeMillis() > (previousLogTime + 1000)) {
> LOG.debug("No master found; retry");
> previousLogTime = System.currentTimeMillis();
>   }
>   refresh = true; // let's try pull it from ZK directly
>   if (sleep(200)) {
> interrupted = true;
>   }
>   continue;
> }
> {code}
> Here we refresh the node only when 'sn' is null; otherwise it will use the 
> same cached data. 
> So in the above case the RegionServer will never report to the active HMaster 
> successfully until HMaster failover or RegionServer restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16816:
---
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.0

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Fix For: 1.4.0
>
> Attachments: HBASE-16816-branch-1-v2.patch, 
> HBASE-16816-branch-1-v3.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, in which 
> case it is not movable. So the caller has no way to know whether the move 
> region operation failed.
> It is a problem for "region_move.rb". It only gives up moving a region if an 
> exception is thrown. Otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}
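
The extra validation the issue proposes can be sketched as follows. This is a 
hypothetical simplification: the enum, method name, and exception types are 
illustrative stand-ins, not the final HBase patch.

```java
public class MoveCheckDemo {
    enum State { OPEN, SPLIT, IN_TRANSITION }

    // Illustrative stand-in for the extra check HMaster.move() could perform:
    // reject regions that exist but are not currently movable, instead of
    // silently returning.
    static void checkMovable(State regionState) {
        if (regionState == null) {
            // Corresponds to the existing UnknownRegionException path.
            throw new IllegalArgumentException("Unknown region");
        }
        if (regionState != State.OPEN) {
            // In a real patch this would be an exception the RPC layer carries
            // back, so region_move.rb's rescue clause gives up instead of
            // polling until its hbase.move.wait.max timeout.
            throw new IllegalStateException("Region not online: " + regionState);
        }
    }

    public static void main(String[] args) {
        checkMovable(State.OPEN); // an online region passes the check
        try {
            checkMovable(State.SPLIT);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Throwing eagerly converts the script's silent timeout-and-retry loop into an 
immediate, explainable failure.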



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570627#comment-15570627
 ] 

Ted Yu commented on HBASE-16816:


+1

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-16816-branch-1-v2.patch, 
> HBASE-16816-branch-1-v3.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, in which 
> case it is not movable. So the caller has no way to know whether the move 
> region operation failed.
> It is a problem for "region_move.rb". It only gives up moving a region if an 
> exception is thrown. Otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16716) OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains forever

2016-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16716:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Pankaj.

> OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains 
> forever
> -
>
> Key: HBASE-16716
> URL: https://issues.apache.org/jira/browse/HBASE-16716
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 2.0.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16716-V2.patch, HBASE-16716-branch-1.patch, 
> HBASE-16716.patch
>
>
> OfflineMetaRepair rebuilds the Meta table; while creating the meta region it 
> creates its own WAL (inside /hbase/WALs/hbck-meta-recovery-) which will 
> be closed and archived after rebuilding Meta. 
> {noformat}
> hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
> >> /hbase/WALs/hbck-meta-recovery-
> {noformat}
> It doesn't clear the empty dir; the empty directory should be removed after 
> success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:40 AM:


Oops, this may be a bug.

Update:

No, it cannot be. When the tool starts we create an instance of Configuration:

{code}
  public static void main(String[] args) throws Exception {
Configuration conf = HBaseConfiguration.create();
int ret = ToolRunner.run(conf, new BackupDriver(), args);
System.exit(ret);
  }
{code}

That is what we get from CLASSPATH. Check your setup, [~saint@gmail.com]





was (Author: vrodionov):
Oops, this may be a bug.




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We are leveraging the existing hbase snapshot feature, and provide a general 
> solution to common users. Our full backup uses snapshot to capture metadata 
> locally and exportsnapshot to move data to another cluster; the incremental 
> backup uses offline-WALPlayer to back up HLogs; we also leverage globally 
> distributed log roll and flush to improve performance; there are other add-on 
> features such as convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detail design document and CLI command will be attached in this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detail implement in the subtasks: 
> * *Full Backup* : provide local and remote back/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a 
> massive amount of data.  The combination of full backups with incremental 
> backups has tremendous benefit for HBase as well.  The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point of time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental

[jira] [Commented] (HBASE-16414) Improve performance for RPC encryption with Apache Common Crypto

2016-10-12 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570606#comment-15570606
 ] 

Colin Ma commented on HBASE-16414:
--

[~Apache9], thanks for your review on the client part. The patch is updated 
according to your comments; the most important change is a new handler added to 
process the response for the connection header.
Feel free to leave any comments, thanks.

> Improve performance for RPC encryption with Apache Common Crypto
> 
>
> Key: HBASE-16414
> URL: https://issues.apache.org/jira/browse/HBASE-16414
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC
>Affects Versions: 2.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HBASE-16414.001.patch, HBASE-16414.002.patch, 
> HBASE-16414.003.patch, HBASE-16414.004.patch, HBASE-16414.005.patch, 
> HBASE-16414.006.patch, HBASE-16414.007.patch, HBASE-16414.008.patch, 
> HbaseRpcEncryptionWithCrypoto.docx
>
>
> HBase RPC encryption is enabled by setting “hbase.rpc.protection” to 
> "privacy". With token authentication, it uses the DIGEST-MD5 mechanism 
> for secure authentication and data protection. DIGEST-MD5 encrypts with DES, 
> 3DES or RC4, which is very slow, especially for Scan, and becomes the 
> bottleneck of RPC throughput.
> Apache Commons Crypto is a cryptographic library optimized with AES-NI. It 
> provides Java APIs at both the cipher level and the Java stream level, so 
> developers can implement high-performance AES encryption/decryption with 
> minimal code and effort. Compared with the current implementation of 
> org.apache.hadoop.hbase.io.crypto.aes.AES, Crypto supports both the JCE Cipher 
> and the OpenSSL Cipher, with the OpenSSL Cipher offering better performance. 
> Users can configure the cipher type; the default is the JCE Cipher.
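The AES speedup that Commons Crypto targets can be illustrated with the JDK's 
own JCE API, whose Cipher interface Commons Crypto's CryptoCipher deliberately 
mirrors. This is a minimal sketch, not HBase's actual wire-encryption code; 
the "AES/CTR/NoPadding" transformation string is the same one Commons Crypto 
accepts, but all class and variable names here are illustrative.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AesCtrDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];   // 128-bit AES key (all zeros: demo only)
        byte[] iv  = new byte[16];   // CTR counter block
        SecretKeySpec keySpec = new SecretKeySpec(key, "AES");

        // Encrypt a payload with AES in CTR mode via the JCE provider.
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, keySpec, new IvParameterSpec(iv));
        byte[] plain = "scan payload".getBytes(StandardCharsets.UTF_8);
        byte[] cipherText = enc.doFinal(plain);

        // Decrypt with the same key and IV; CTR mode round-trips exactly.
        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, keySpec, new IvParameterSpec(iv));
        byte[] roundTrip = dec.doFinal(cipherText);

        System.out.println(Arrays.equals(plain, roundTrip)); // true
    }
}
```

Commons Crypto's OpenSSL-backed cipher is a drop-in for the Cipher usage 
above, which is what makes the JCE-vs-OpenSSL configuration switch described 
in the issue possible.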



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16749) HBase root pom.xml contains repo from people.apache.org/~garyh

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570601#comment-15570601
 ] 

Hudson commented on HBASE-16749:


SUCCESS: Integrated in Jenkins build HBase-1.2-JDK8 #41 (See 
[https://builds.apache.org/job/HBase-1.2-JDK8/41/])
HBASE-16749 Remove defunct repo from people.apache.org (garyh: rev 
211559e6d012b52b3e0fc8ecbd1a71856c9ae06c)
* (edit) pom.xml


> HBase root pom.xml contains repo from people.apache.org/~garyh
> --
>
> Key: HBASE-16749
> URL: https://issues.apache.org/jira/browse/HBASE-16749
> Project: HBase
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Gary Helmling
> Fix For: 1.2.5, 1.1.8
>
> Attachments: HBASE-16749.branch-1.1.patch, 
> HBASE-16749.branch-1.2.patch, HBASE-16749.branch-1.2.patch
>
>
> This is found when I was building Hadoop/YARN:
> {code}
> [INFO] 
> 
> [INFO] Building Apache Hadoop YARN Timeline Service 3.0.0-alpha2-SNAPSHOT
> [INFO] 
> 
> Downloading: 
> http://conjars.org/repo/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> {code}
> [~te...@apache.org] mentioned:
> bq. Among hbase releases (1.1.x, 1.2.y), root pom.xml contains 
> "ghelmling.testing" repo. 
> This private repo sometimes extends the Hadoop build time a lot. It might be 
> better to use ASF snapshot repo first before downloading from private repo.





[jira] [Comment Edited] (HBASE-16786) Procedure V2 - Move ZK-lock's uses to Procedure framework locks (LockProcedure)

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570583#comment-15570583
 ] 

Jingcheng Du edited comment on HBASE-16786 at 10/13/16 2:34 AM:


Hi [~appy].
The lock in MOB is trying to avoid making deleted cells live again.
If we always retain the delete markers in major compaction until they expire - 
as we already do when the write lock cannot be acquired - the locks are not 
needed anymore. What do you think about this approach? [~jmhsieh], 
[~mbertozzi].


was (Author: jingcheng...@intel.com):
Hi [~appy].
The lock in MOB is trying to avoid making deleted cells live again.
If we always retain the delete markers in major compaction, the locks are not 
needed anymore. How do you think about this approach? [~jmhsieh], [~mbertozzi].

> Procedure V2 - Move ZK-lock's uses to Procedure framework locks 
> (LockProcedure)
> ---
>
> Key: HBASE-16786
> URL: https://issues.apache.org/jira/browse/HBASE-16786
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-16786.master.001.patch, 
> HBASE-16786.master.002.patch
>
>






[jira] [Updated] (HBASE-16414) Improve performance for RPC encryption with Apache Common Crypto

2016-10-12 Thread Colin Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Ma updated HBASE-16414:
-
Attachment: HBASE-16414.008.patch






[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compaction

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570596#comment-15570596
 ] 

Ted Yu commented on HBASE-16578:


[~mbertozzi]:
Do you want to take a look?

Thanks

> Mob data loss after mob compaction and normal compaction
> 
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: Jingcheng Du
> Attachments: HBASE-16578-V2.patch, HBASE-16578.patch, 
> TestMobCompaction.java, TestMobCompaction.java
>
>
> StoreFileScanners on MOB cells rely on the scannerOrder to find the latest 
> cells after mob compaction. The value of scannerOrder is assigned by the 
> order of the maxSeqId of the StoreFiles, and this maxSeqId is set only after 
> the reader of the StoreFile is created.
> In {{Compactor.compact}}, the compacted store files are cloned and their 
> readers are not created. In {{StoreFileScanner.getScannersForStoreFiles}}, 
> the StoreFiles are sorted before the readers are created, and at that time the 
> maxSeqId of each file is -1 (the default value). This leads to chaos among the 
> scanners in the following normal compaction: some older cells might be 
> chosen during the normal compaction.
> We need to create the readers either before the sorting in 
> {{StoreFileScanner.getScannersForStoreFiles}}, or just after 
> the store files are cloned in {{Compactor.compact}}.
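The ordering problem described above can be sketched in isolation. The class 
below is a hypothetical stand-in, not HBase's real StoreFile: it only shows 
that sorting on a sequence id that defaults to -1 before the reader is opened 
carries no information, while opening the readers first restores the intended 
order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SeqIdSortDemo {
    public static void main(String[] args) {
        List<SketchStoreFile> files = Arrays.asList(
                new SketchStoreFile("older", 5),
                new SketchStoreFile("newer", 9));

        // Sorting before any reader exists compares -1 against -1, so the
        // resulting order says nothing about which file is actually newer.
        files.sort(Comparator.comparingLong(f -> f.maxSeqId));
        System.out.println(files.get(0).maxSeqId); // -1: seq ids not loaded yet

        // The fix sketched in the description: open the readers first,
        // so maxSeqId is populated, then sort.
        files.forEach(SketchStoreFile::openReader);
        files.sort(Comparator.comparingLong(f -> f.maxSeqId));
        System.out.println(files.get(1).name);     // the truly newer file last
    }
}

// Hypothetical stand-in for a store file whose max sequence id is only
// known once its reader has been opened.
class SketchStoreFile {
    final String name;
    final long realSeqId;
    long maxSeqId = -1;          // default value before the reader is created

    SketchStoreFile(String name, long realSeqId) {
        this.name = name;
        this.realSeqId = realSeqId;
    }

    void openReader() { maxSeqId = realSeqId; }
}
```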





[jira] [Commented] (HBASE-16716) OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains forever

2016-10-12 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570594#comment-15570594
 ] 

Pankaj Kumar commented on HBASE-16716:
--

The test case failure is not relevant; it passes locally.

The FindBugs warning is also not relevant:
{noformat}
Return value of java.util.concurrent.CountDownLatch.await(long, TimeUnit) 
ignored in 
org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper()
{noformat}

> OfflineMetaRepair leaves empty directory inside /hbase/WALs which remains 
> forever
> -
>
> Key: HBASE-16716
> URL: https://issues.apache.org/jira/browse/HBASE-16716
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 2.0.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16716-V2.patch, HBASE-16716-branch-1.patch, 
> HBASE-16716.patch
>
>
> OfflineMetaRepair rebuilds the meta table. While creating the meta region, it 
> creates its own WAL (inside /hbase/WALs/hbck-meta-recovery-), which will 
> be closed and archived after rebuilding meta. 
> {noformat}
> hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
> >> /hbase/WALs/hbck-meta-recovery-
> {noformat}
> It doesn't clean up the directory; the empty directory should be removed after 
> success.
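The missing cleanup step amounts to deleting the recovery directory once the 
rebuild succeeded and the WAL has been archived. A sketch with java.nio.file - 
the real fix would go through Hadoop's FileSystem API, and the method and 
directory names here are purely illustrative:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class CleanupDemo {
    // Delete the directory only if it is empty; a non-empty directory
    // means archiving has not finished, so we leave it alone.
    static boolean removeIfEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            if (entries.iterator().hasNext()) {
                return false;
            }
        }
        Files.delete(dir);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hbck-meta-recovery-demo");
        System.out.println(removeIfEmpty(dir));  // true: it was empty
        System.out.println(Files.exists(dir));   // false: now gone
    }
}
```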





[jira] [Comment Edited] (HBASE-16786) Procedure V2 - Move ZK-lock's uses to Procedure framework locks (LockProcedure)

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570583#comment-15570583
 ] 

Jingcheng Du edited comment on HBASE-16786 at 10/13/16 2:28 AM:


Hi [~appy].
The lock in MOB is trying to avoid making deleted cells live again.
If we always retain the delete markers in major compaction, the locks are not 
needed anymore. How do you think about this approach? [~jmhsieh], [~mbertozzi].


was (Author: jingcheng...@intel.com):
Hi [~appy].
The locks is trying to avoid making deleted cells live again.
If we always retain the delete markers in major compaction, the locks are not 
needed anymore. How do you think about this approach? [~jmhsieh], [~mbertozzi].







[jira] [Commented] (HBASE-16786) Procedure V2 - Move ZK-lock's uses to Procedure framework locks (LockProcedure)

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570583#comment-15570583
 ] 

Jingcheng Du commented on HBASE-16786:
--

Hi [~appy].
The locks is trying to avoid making deleted cells live again.
If we always retain the delete markers in major compaction, the locks are not 
needed anymore. How do you think about this approach? [~jmhsieh], [~mbertozzi].







[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:15 AM:


Oops, may be a bug.





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to configure HBase cluster. From 
your log file, we  see that LocalFileSystem is used instead of hdfs, that looks 
like your own configuration problem.  If you have cluster, go to any node to 
HBASE_HOME/bin and run backup command. If it fails, then you will not be able 
to run any hbase command, including 'shell'. Can you run "hbase shell" and 
connect to cluster?




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this JIRA. 
> We leverage the existing HBase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and ExportSnapshot to move data to another cluster; 
> the incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage globally distributed log roll and flush to improve performance; other 
> added-on values include convert, merge, progress report, and CLI commands, so 
> that a common user can back up HBase data without in-depth knowledge of HBase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this JIRA. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLogs to HFiles offline (for incremental 
> backup)
> * *distributed* log roll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert* incremental backup WAL files into HFiles
> * *Merge* several backup images into one (like merging weekly into monthly)
> * *add and remove* a table to and from a backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on an *existing snapshot*
> *-*
> *Below is the original description, kept here as the history of the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a 
> massive amount of data. The combination of full backups with incremental 
> backups has tremendous benefit for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s). 
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshots to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental backup.

[jira] [Updated] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Allan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-16816:
---
Attachment: HBASE-16816-branch-1-v3.patch

Use TEST_UTIL.waitUntilNoRegionsInTransition instead of sleeping.

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-16816-branch-1-v2.patch, 
> HBASE-16816-branch-1-v3.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It silently returns if the region is split or in transition, i.e. not 
> movable, so the caller has no way to know whether the move operation failed.
> This is a problem for "region_move.rb": it only gives up moving a region if an 
> exception is thrown; otherwise, it waits until a timeout and retries. 
> Without an exception, it has no idea that the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}
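The fix this issue asks for - throw instead of silently returning - can be 
sketched with stand-in types. SketchState and the exception class below are 
hypothetical, not HBase's real RegionState or UnknownRegionException; the 
point is only the guard pattern that lets callers like region_move.rb give up 
immediately.

```java
class MoveGuardDemo {
    static void move(String region, SketchState state)
            throws UnknownRegionSketchException {
        if (state == null) {
            // Existing behavior: unknown region is already an error.
            throw new UnknownRegionSketchException(region);
        }
        if (state != SketchState.OPEN) {
            // Previously this case returned silently; throwing tells the
            // caller the region is not movable right now.
            throw new UnknownRegionSketchException(
                    region + " is " + state + ", not movable");
        }
        // ... the actual reassignment would happen here ...
    }

    public static void main(String[] args) {
        try {
            move("abc123", SketchState.SPLITTING);
        } catch (UnknownRegionSketchException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for the subset of region states that matter here.
enum SketchState { OPEN, SPLITTING, IN_TRANSITION }

class UnknownRegionSketchException extends Exception {
    UnknownRegionSketchException(String msg) { super(msg); }
}
```

With this guard in place, the Ruby script's existing rescue block catches the 
failure instead of looping until its timeout.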





[jira] [Commented] (HBASE-16812) Cleanup deprecated compact() function

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570560#comment-15570560
 ] 

Jingcheng Du commented on HBASE-16812:
--

bq. I tried to find the locks for CompactionTool and ExpiredMobFileCleaner but 
couldn't find it. Can you please point them to me, maybe github links. Thanks.
I reviewed the code in CompactionTool: it uses a dummy HStore, and indeed there 
is no lock there. I can add one.
ExpiredMobFileCleaner has two entry points. One is the 
{{ExpiredMobFileCleanerChore}}, where a read lock is used. The other is the 
tool itself, and we don't have locks for it either; I can add them. Sorry for 
misunderstanding your words.

> Cleanup deprecated compact() function
> -
>
> Key: HBASE-16812
> URL: https://issues.apache.org/jira/browse/HBASE-16812
> Project: HBase
>  Issue Type: Task
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
> Attachments: HBASE-16812.master.001.patch
>
>
> compact(CompactionContext compaction, CompactionThroughputController 
> throughputController) is [deprecated in 1.2.0 
> release|https://github.com/apache/hbase/blob/rel/1.2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java#L222].
> Store.java is also marked limited private.
> Context: I was cleaning up zk table lock which is also used in that method's 
> [override|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L460]
>  in HMobStore.
> This method isn't being called from anywhere except CompactionTool (which 
> creates an HStore object, not an HMobStore object).
> [~jingcheng...@intel.com] Can you PTAL and help me understand what's going on.





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:12 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to configure HBase cluster. From 
your log file, we  see that LocalFileSystem is used instead of hdfs, that looks 
like your own configuration problem.  If you have cluster, go to any node to 
HBASE_HOME/bin and run backup command. If it fails, then you will not be able 
to run any hbase command, including 'shell'. Can you run "hbase shell" and 
connect to cluster?





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. Can you run "hbase shell" and connect to cluster?





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:11 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. Can you run "hbase shell" and connect to cluster?





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. 





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:00 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are will not be able to run any hbase command, including 
'shell'. 




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses the offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other added 
> values include convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (e.g. merge weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefits for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> 

[jira] [Created] (HBASE-16821) Enhance LoadIncrementalHFiles to convey missing hfiles if any

2016-10-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-16821:
--

 Summary: Enhance LoadIncrementalHFiles to convey missing hfiles if 
any
 Key: HBASE-16821
 URL: https://issues.apache.org/jira/browse/HBASE-16821
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


When the map parameter of the run() method is not null:
{code}
  public int run(String dirPath, Map<byte[], List<Path>> map, TableName tableName) throws Exception {
{code}
the caller knows the exact files to be bulk loaded.

This issue is to enhance the run() API so that, when certain hfiles turn out 
to be missing, the return value indicates which files are missing.
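One possible shape for the enhanced return value, sketched with a hypothetical BulkLoadResult class and java.nio paths (purely illustrative; the real API would use Hadoop's Path and live alongside LoadIncrementalHFiles):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical result object: instead of a bare int return code, run() could
// report which of the requested hfiles were present and which were missing.
public class BulkLoadResult {
    private final List<Path> present = new ArrayList<>();
    private final List<Path> missing = new ArrayList<>();

    // Partitions the candidate hfiles by existence; a missing file is
    // recorded and reported back to the caller instead of silently skipped.
    public static BulkLoadResult check(List<Path> candidates) {
        BulkLoadResult result = new BulkLoadResult();
        for (Path p : candidates) {
            if (Files.exists(p)) {
                result.present.add(p);
            } else {
                result.missing.add(p);
            }
        }
        return result;
    }

    public List<Path> getPresent() { return present; }
    public List<Path> getMissing() { return missing; }
    public boolean allPresent() { return missing.isEmpty(); }
}
```

A caller could then fail fast when allPresent() is false and log getMissing().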



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16749) HBase root pom.xml contains repo from people.apache.org/~garyh

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570549#comment-15570549
 ] 

Hudson commented on HBASE-16749:


SUCCESS: Integrated in Jenkins build HBase-1.2-JDK7 #46 (See 
[https://builds.apache.org/job/HBase-1.2-JDK7/46/])
HBASE-16749 Remove defunct repo from people.apache.org (garyh: rev 
211559e6d012b52b3e0fc8ecbd1a71856c9ae06c)
* (edit) pom.xml


> HBase root pom.xml contains repo from people.apache.org/~garyh
> --
>
> Key: HBASE-16749
> URL: https://issues.apache.org/jira/browse/HBASE-16749
> Project: HBase
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Gary Helmling
> Fix For: 1.2.5, 1.1.8
>
> Attachments: HBASE-16749.branch-1.1.patch, 
> HBASE-16749.branch-1.2.patch, HBASE-16749.branch-1.2.patch
>
>
> This is found when I was building Hadoop/YARN:
> {code}
> [INFO] 
> 
> [INFO] Building Apache Hadoop YARN Timeline Service 3.0.0-alpha2-SNAPSHOT
> [INFO] 
> 
> Downloading: 
> http://conjars.org/repo/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> {code}
> [~te...@apache.org] mentioned:
> bq. Among hbase releases (1.1.x, 1.2.y), root pom.xml contains 
> "ghelmling.testing" repo. 
> This private repo sometimes extends the Hadoop build time a lot. It might be 
> better to use ASF snapshot repo first before downloading from private repo.





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 1:58 AM:


[~stack]
You can file a JIRA, of course, but I think it will be closed with the 
standard suggestion to ask the community for help on how to configure an 
HBase cluster. From your log file, we can clearly see that LocalFileSystem 
is used instead of HDFS, which is your own configuration issue.

If you have a cluster, go to HBASE_HOME/bin on any node

and run the backup command.

If it fails, then you will not be able to run any hbase command, including 
'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are not able to run any hbase command, including 'shell'. 




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses the offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other added 
> values include convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (e.g. merge weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefits for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The

[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 1:56 AM:


[~stack]
You can file a JIRA, of course, but I think it will be closed with the 
standard suggestion to ask the community for help on how to configure an 
HBase cluster. From your log file, we can clearly see that LocalFileSystem 
is used instead of HDFS, which is your own configuration issue.

If you have a cluster, go to HBASE_HOME/bin on any node

and run the backup command.

If it fails, then you are not able to run any hbase command, including 'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses the offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other added 
> values include convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (e.g. merge weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefits for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups 

[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov commented on HBASE-7912:
--

[~stack]
You can file a JIRA, of course, but I think it will be closed with the 
standard suggestion to ask the community for help on how to configure an 
HBase cluster. From your log file, we can clearly see that LocalFileSystem 
is used instead of HDFS, which is your own configuration issue. 




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses the offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other added 
> values include convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (e.g. merge weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefits for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> o

[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compcation

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570525#comment-15570525
 ] 

Ted Yu commented on HBASE-16578:


+1 if tests pass.

> Mob data loss after mob compaction and normal compcation
> 
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: Jingcheng Du
> Attachments: HBASE-16578-V2.patch, HBASE-16578.patch, 
> TestMobCompaction.java, TestMobCompaction.java
>
>
> StoreFileScanners on MOB cells rely on the scannerOrder to find the latest 
> cells after mob compaction. The value of scannerOrder is assigned by the 
> order of the maxSeqId of each StoreFile, and this maxSeqId is set only after 
> the reader of the StoreFile is created.
> In {{Compactor.compact}}, the compacted store files are cloned and their 
> readers are not created. And in {{StoreFileScanner.getScannersForStoreFiles}} 
> the StoreFiles are sorted before the readers are created, and at that time 
> the maxSeqId for each file is -1 (the default value). This leads to chaos 
> among the scanners in the following normal compaction: some older cells 
> might be chosen during the normal compaction.
> We need to create readers either before the sorting in the method 
> {{StoreFileScanner.getScannersForStoreFiles}}, or create readers just after 
> the store files are cloned in {{Compactor.compact}}.
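A toy illustration of the ordering problem described above (hypothetical StoreFileStub class, not HBase code): when files are sorted before their readers assign maxSeqId, every sort key is still the default -1 and the result is not the intended sequence-id order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model: a store file whose sequence id is only known once its reader
// has been "opened", mirroring the maxSeqId behaviour described above.
class StoreFileStub {
    final String name;
    final long realSeqId;
    long maxSeqId = -1; // default until the reader is created

    StoreFileStub(String name, long realSeqId) {
        this.name = name;
        this.realSeqId = realSeqId;
    }

    void openReader() {
        maxSeqId = realSeqId;
    }
}

class SortDemo {
    // Sorts by maxSeqId; if readers were not opened first, every key is -1
    // and the stable sort simply preserves the (possibly wrong) input order.
    static List<String> sortedNames(List<StoreFileStub> files, boolean openReadersFirst) {
        if (openReadersFirst) {
            files.forEach(StoreFileStub::openReader);
        }
        List<StoreFileStub> copy = new ArrayList<>(files);
        copy.sort(Comparator.comparingLong(f -> f.maxSeqId));
        List<String> names = new ArrayList<>();
        copy.forEach(f -> names.add(f.name));
        return names;
    }
}
```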





[jira] [Commented] (HBASE-16653) Backport HBASE-11393 to all branches which support namespace

2016-10-12 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570524#comment-15570524
 ] 

Guanghao Zhang commented on HBASE-16653:


ReplicationPeers, ReplicationPeersZKImpl, and ReplicationPeerZKImpl are 
InterfaceAudience.Private, so I kept those changes in the v3 patch. But the 
ReplicationPeerConfig class is InterfaceAudience.Public, and the v3 patch 
added two methods to it. Does this have compatibility issues?

> Backport HBASE-11393 to all branches which support namespace
> 
>
> Key: HBASE-16653
> URL: https://issues.apache.org/jira/browse/HBASE-16653
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.0.5, 1.3.1, 0.98.22, 1.1.7, 1.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 1.4.0
>
> Attachments: HBASE-16653-branch-1-v1.patch, 
> HBASE-16653-branch-1-v2.patch, HBASE-16653-branch-1-v3.patch
>
>
> As HBASE-11386 mentioned, the parsing code for the replication table-cfs 
> config is wrong when a table name contains a namespace, so we can only 
> configure the default namespace's tables in the peer. It is a bug in all 
> branches which support namespaces. HBASE-11393 resolved this by using a pb 
> object, but it was only merged to the master branch; other branches still 
> have this problem. I think we should fix this bug in all branches which 
> support namespaces.





[jira] [Commented] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570522#comment-15570522
 ] 

Ted Yu commented on HBASE-16816:


Can you detect that the region is online without sleeping?
{code}
// wait for region online
Thread.sleep(1000);
{code}
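A bounded polling loop is one way to avoid the fixed one-second sleep. This is a generic sketch (the condition would be a check against the master's online-region state, which is not shown here):

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Polls the condition until it holds or the deadline passes, instead of
    // sleeping for a fixed interval and hoping the region came online in time.
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One last check so a condition that became true near the deadline
        // is not missed.
        return condition.getAsBoolean();
    }
}
```

In the test above, `Thread.sleep(1000)` would become something like `waitFor(() -> isRegionOnline(region), 10_000, 100)`, where isRegionOnline is a hypothetical check against the master's region states.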

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-16816-branch-1-v2.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, and thus 
> not movable. So the caller has no way to know whether the move region 
> operation failed.
> It is a problem for "region_move.rb". It only gives up moving a region if an 
> exception is thrown. Otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}





[jira] [Commented] (HBASE-16812) Cleanup deprecated compact() function

2016-10-12 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570519#comment-15570519
 ] 

Appy commented on HBASE-16812:
--

I tried to find the locks for CompactionTool and ExpiredMobFileCleaner but 
couldn't find them. Can you please point me to them, maybe with github links? 
Thanks.


> Cleanup deprecated compact() function
> -
>
> Key: HBASE-16812
> URL: https://issues.apache.org/jira/browse/HBASE-16812
> Project: HBase
>  Issue Type: Task
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
> Attachments: HBASE-16812.master.001.patch
>
>
> compact(CompactionContext compaction, CompactionThroughputController 
> throughputController) is [deprecated in 1.2.0 
> release|https://github.com/apache/hbase/blob/rel/1.2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java#L222].
> Store.java is also marked limited private.
> Context: I was cleaning up the zk table lock, which is also used in that 
> method's 
> [override|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L460]
>  in HMobStore.
> This method isn't being called from anywhere except CompactionTool (which 
> creates an HStore object, not an HMobStore object).
> [~jingcheng...@intel.com] Can you PTAL and help me understand what's going on.





[jira] [Updated] (HBASE-16578) Mob data loss after mob compaction and normal compcation

2016-10-12 Thread Jingcheng Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HBASE-16578:
-
Attachment: HBASE-16578-V2.patch

Upload a new patch V2 to address Ted's comments.

> Mob data loss after mob compaction and normal compcation
> 
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: Jingcheng Du
> Attachments: HBASE-16578-V2.patch, HBASE-16578.patch, 
> TestMobCompaction.java, TestMobCompaction.java
>
>
> StoreFileScanners on MOB cells rely on the scannerOrder to find the latest 
> cells after mob compaction. The value of scannerOrder is assigned by the 
> order of the maxSeqId of each StoreFile, and this maxSeqId is set only after 
> the reader of the StoreFile is created.
> In {{Compactor.compact}}, the compacted store files are cloned and their 
> readers are not created. And in {{StoreFileScanner.getScannersForStoreFiles}} 
> the StoreFiles are sorted before the readers are created, and at that time 
> the maxSeqId for each file is -1 (the default value). This leads to chaos 
> among the scanners in the following normal compaction: some older cells 
> might be chosen during the normal compaction.
> We need to create readers either before the sorting in the method 
> {{StoreFileScanner.getScannersForStoreFiles}}, or create readers just after 
> the store files are cloned in {{Compactor.compact}}.





[jira] [Updated] (HBASE-16816) HMaster.move() should throw exception if region to move is not online

2016-10-12 Thread Allan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-16816:
---
Attachment: HBASE-16816-branch-1-v2.patch

Fix TestWarmupRegion UT failures.

> HMaster.move() should throw exception if region to move is not online
> -
>
> Key: HBASE-16816
> URL: https://issues.apache.org/jira/browse/HBASE-16816
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 1.1.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-16816-branch-1-v2.patch, HBASE-16816-branch-1.patch
>
>
> The move region function in HMaster only checks whether the region to move 
> exists
> {code}
> if (regionState == null) {
>   throw new 
> UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
> }
> {code}
> It will not return anything if the region is split or in transition, and thus 
> not movable. So the caller has no way to know whether the move region 
> operation failed.
> It is a problem for "region_move.rb". It only gives up moving a region if an 
> exception is thrown. Otherwise, it will wait until a timeout and retry. 
> Without an exception, it has no idea the region is not movable.
> {code}
> begin
>   admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
> rescue java.lang.reflect.UndeclaredThrowableException,
> org.apache.hadoop.hbase.UnknownRegionException => e
>   $LOG.info("Exception moving "  + r.getEncodedName() +
> "; split/moved? Continuing: " + e)
>   return
> end
>  # Wait till its up on new server before moving on
> maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 
> 60)
> maxWait = Time.now + maxWaitInSeconds
> while Time.now < maxWait
>   same = isSameServer(admin, r, original)
>   break unless same
>   sleep 0.1
> end
>   end
> {code}





[jira] [Commented] (HBASE-16815) Low scan ratio in RPC queue tuning triggers divide by zero exception

2016-10-12 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570507#comment-15570507
 ] 

Guanghao Zhang commented on HBASE-16815:


If the number of queues is 0, maybe it is better to skip startHandler() 
entirely?
{noformat}
2016-10-12 12:50:27,311 DEBUG [main] ipc.RWQueueRpcExecutor: FifoRWQ.default 
writeQueues=7 writeHandlers=15 readQueues=8 readHandlers=14
{noformat}
I think we could change the log level to INFO and also log the number of scan 
queues and the scan handler count.
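The arithmetic that produces the zero, and the proposed guard, can be sketched as follows (a simplified model, not the actual RpcExecutor code; method names here are illustrative):

```java
public class QueueSplit {
    // Simplified version of the call-queue split described above:
    // total queues = handlers * handlerFactor, a share of them become read
    // queues, and a fraction of the read queues is carved out for scans.
    public static int scanQueues(int handlers, float handlerFactor,
                                 float readRatio, float scanRatio) {
        int numQueues = Math.max(1, (int) (handlers * handlerFactor));
        int readQueues = (int) Math.floor(numQueues * readRatio);
        return (int) Math.floor(readQueues * scanRatio);
    }

    // Guarded handler start: returns the queue index each handler would bind
    // to, or an empty array when there are no queues, so that "i % qsize"
    // can never divide by zero.
    public static int[] handlerQueueIndexes(int numHandlers, int qsize) {
        if (qsize == 0) {
            return new int[0]; // no queues: skip starting handlers entirely
        }
        int[] indexes = new int[numHandlers];
        for (int i = 0; i < numHandlers; i++) {
            indexes[i] = i % qsize; // safe: qsize > 0
        }
        return indexes;
    }
}
```

With the settings in the issue below, the scan ratio rounds the scan queue count down to zero, so the guarded version simply starts no scan handlers instead of computing a modulo by zero.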

> Low scan ratio in RPC queue tuning triggers divide by zero exception
> 
>
> Key: HBASE-16815
> URL: https://issues.apache.org/jira/browse/HBASE-16815
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Lars George
>
> Trying the following settings:
> {noformat}
> <property>
>   <name>hbase.ipc.server.callqueue.handler.factor</name>
>   <value>0.5</value>
> </property>
> <property>
>   <name>hbase.ipc.server.callqueue.read.ratio</name>
>   <value>0.5</value>
> </property>
> <property>
>   <name>hbase.ipc.server.callqueue.scan.ratio</name>
>   <value>0.1</value>
> </property>
> {noformat}
> With 30 default handlers, this means 15 queues. Per the debug log below, they 
> are split into 7 write queues and 8 read queues. 10% of the read queues is 
> {{0.8}}, which is then floor'ed to {{0}}. The debug log confirms it, as the 
> ternary check omits the scan details when they are zero:
> {noformat}
> 2016-10-12 12:50:27,305 INFO  [main] ipc.SimpleRpcScheduler: Using fifo as 
> user call queue, count=15
> 2016-10-12 12:50:27,311 DEBUG [main] ipc.RWQueueRpcExecutor: FifoRWQ.default 
> writeQueues=7 writeHandlers=15 readQueues=8 readHandlers=14
> {noformat}
> But the code in {{RWQueueRpcExecutor}} calls {{RpcExecutor.startHandler()}} 
> nevertheless and that does this:
> {code}
> for (int i = 0; i < numHandlers; i++) {
>   final int index = qindex + (i % qsize);
>   String name = "RpcServer." + threadPrefix + ".handler=" + 
> handlers.size() + ",queue=" +
>   index + ",port=" + port;
> {code}
> The modulo then triggers:
> {noformat}
> 2016-10-12 11:41:22,810 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:145)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:220)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:155)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:222)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2524)
> Caused by: java.lang.ArithmeticException: / by zero
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.startHandlers(RpcExecutor.java:125)
> at 
> org.apache.hadoop.hbase.ipc.RWQueueRpcExecutor.startHandlers(RWQueueRpcExecutor.java:178)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.start(RpcExecutor.java:78)
> at 
> org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.start(SimpleRpcScheduler.java:272)
> at org.apache.hadoop.hbase.ipc.RpcServer.start(RpcServer.java:2212)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.start(RSRpcServices.java:1143)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:615)
> at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:396)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.(HMasterCommandLine.java:312)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140)
> ... 7 more
> {noformat}
> That causes the server to not even start. I would suggest we either skip the 
> {{startHandler()}} call altogether, or make it zero-aware.
> Another possible option is to reserve at least _one_ scan handler/queue when 
> the scan ratio is greater than zero, but only if there is more than one read 
> handler/queue to begin with. Otherwise the scan handler/queue should be zero 
> and share the one read handler/queue.
> Makes sense?
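The floor-to-zero arithmetic and the suggested zero-aware guard can be sketched as follows. This is a minimal standalone sketch; the class and method names ({{ScanQueueMath}}, {{scanQueues}}, {{startHandlers}}) are hypothetical, not the actual RpcExecutor code.

```java
// Sketch of the queue-count arithmetic that yields zero scan queues, plus a
// zero-aware guard before starting handlers. Hypothetical names throughout.
public class ScanQueueMath {

    // Mirrors the floor'ed computation: read queues times scan ratio, floored.
    static int scanQueues(int numReadQueues, float scanRatio) {
        return (int) Math.floor(numReadQueues * scanRatio);
    }

    // Zero-aware start: skip entirely when there are no queues, which avoids
    // the "i % qsize" ArithmeticException from the stack trace above.
    static int startHandlers(int numHandlers, int qsize) {
        if (qsize == 0) {
            return 0; // nothing to start; guard against modulo by zero
        }
        int started = 0;
        for (int i = 0; i < numHandlers; i++) {
            int index = i % qsize; // safe now that qsize > 0
            started++;
        }
        return started;
    }

    public static void main(String[] args) {
        // 7 read queues at a 0.1 scan ratio floors to 0 scan queues.
        assert scanQueues(7, 0.1f) == 0;
        // The guarded start no longer throws ArithmeticException.
        assert startHandlers(14, 0) == 0;
        assert startHandlers(14, 7) == 14;
        System.out.println("ok");
    }
}
```

The same shape would support the other option discussed above: reserving at least one scan queue when the ratio is positive would simply clamp the result of {{scanQueues}} to a minimum of 1.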




[jira] [Updated] (HBASE-16786) Procedure V2 - Move ZK-lock's uses to Procedure framework locks (LockProcedure)

2016-10-12 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-16786:
-
Attachment: HBASE-16786.master.002.patch

> Procedure V2 - Move ZK-lock's uses to Procedure framework locks 
> (LockProcedure)
> ---
>
> Key: HBASE-16786
> URL: https://issues.apache.org/jira/browse/HBASE-16786
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-16786.master.001.patch, 
> HBASE-16786.master.002.patch
>
>






[jira] [Commented] (HBASE-16812) Cleanup deprecated compact() function

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570482#comment-15570482
 ] 

Jingcheng Du commented on HBASE-16812:
--

Thanks a lot [~appy]. I can take this JIRA to clean up the deprecated compact() 
method.
bq. If compaction tool triggers compact, then no locks are being taken.
Actually the lock is taken in CompactionTool, because we do this in 
{{DefaultMobStoreCompactor.compact}}.
CompactionTool doesn't do mob compaction today; how about adding it in another 
jira, and doing only the cleanup in this one?
bq. If ExpiredMobFileCleaner is triggered from command line, then no locks are 
taken.
We take the read lock there too, to synchronize this operation with mob and 
major compaction.

> Cleanup deprecated compact() function
> -
>
> Key: HBASE-16812
> URL: https://issues.apache.org/jira/browse/HBASE-16812
> Project: HBase
>  Issue Type: Task
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
> Attachments: HBASE-16812.master.001.patch
>
>
> compact(CompactionContext compaction, CompactionThroughputController 
> throughputController) is [deprecated in 1.2.0 
> release|https://github.com/apache/hbase/blob/rel/1.2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java#L222].
> Store.java is also marked limited private.
> Context: I was cleaning up zk table lock which is also used in that method's 
> [override|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L460]
>  in HMobStore.
> This method isn't being called from anywhere except CompactionTool (which 
> creates HStore object, not HMobStore object).
> [~jingcheng...@intel.com] Can you PTAL and help me understand what's going on.





[jira] [Updated] (HBASE-16752) Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit

2016-10-12 Thread Ashu Pachauri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashu Pachauri updated HBASE-16752:
--
Attachment: HBASE-16752.V2.patch

V2: Close the connection after making sure we send the response for the 
offending rpc to the client. Also adds javadoc for 
RequestTooBigException.

> Upgrading from 1.2 to 1.3 can lead to replication failures due to difference 
> in RPC size limit
> --
>
> Key: HBASE-16752
> URL: https://issues.apache.org/jira/browse/HBASE-16752
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-16752.V1.patch, HBASE-16752.V2.patch
>
>
> In HBase 1.2, we don't limit the size of a single RPC, but in 1.3 we limit it 
> by default to 256 MB. This means that during upgrade scenarios (or when the 
> source is on 1.2 and the peer is already on 1.3), it's possible to encounter 
> a situation where we try to send an rpc larger than 256 MB, because we never 
> unroll a WALEdit while sending replication traffic.
> RpcServer throws the underlying exception locally but closes the connection 
> without returning the underlying error to the client, so the client only sees 
> a "Broken pipe" error.
> I am not sure what is the proper fix here (or if one is needed) to make sure 
> this does not happen, but we should return the underlying exception to the 
> RpcClient, because without it, it can be difficult to diagnose the problem, 
> especially for someone new to HBase.





[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compcation

2016-10-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570444#comment-15570444
 ] 

Jingcheng Du commented on HBASE-16578:
--

Thanks a lot for the review, [~te...@apache.org]! I will upload a patch to 
address this.
bq. The original test is superseded by the new test ?
Yes, the new test supersedes the old one.

> Mob data loss after mob compaction and normal compcation
> 
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: Jingcheng Du
> Attachments: HBASE-16578.patch, TestMobCompaction.java, 
> TestMobCompaction.java
>
>
> StoreFileScanners on MOB cells rely on the scannerOrder to find the latest 
> cells after mob compaction. The value of scannerOrder is assigned by the 
> order of maxSeqId of StoreFile, and this maxSeqId is valued only after the 
> reader of the StoreFile is created.
> In {{Compactor.compact}}, the compacted store files are cloned and their 
> readers are not created. And in {{StoreFileScanner.getScannersForStoreFiles}} 
> the StoreFiles are sorted before the readers are created and at that time the 
> maxSeqId for each file is -1 (the default value). This leads to chaos in the 
> scanners during the following normal compaction. Some older cells might be 
> chosen during the normal compaction.
> We need to create readers either before the sorting in the method 
> {{StoreFileScanner.getScannersForStoreFiles}}, or create readers just after 
> the store files are cloned in {{Compactor.compact}}.
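A minimal model of the ordering problem described above. The classes and method names here are hypothetical, not the real StoreFile/StoreFileScanner code; the point is only that sorting by a seqId that is still at its -1 default is a no-op.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model: store files sorted by maxSeqId before their readers are opened,
// while every maxSeqId still holds the default -1. Hypothetical classes.
public class SeqIdOrdering {

    static final class FileModel {
        final String name;
        long maxSeqId = -1; // default until a reader is created
        FileModel(String name) { this.name = name; }
        void openReader(long realSeqId) { this.maxSeqId = realSeqId; }
    }

    // Returns the name of the file a scanner would treat as newest.
    static String newestAfterSort(List<FileModel> files) {
        files.sort(Comparator.comparingLong(f -> f.maxSeqId));
        return files.get(files.size() - 1).name;
    }

    public static void main(String[] args) {
        List<FileModel> files = new ArrayList<>();
        FileModel older = new FileModel("older");
        FileModel newer = new FileModel("newer");
        files.add(newer);
        files.add(older);

        // Sorting before readers open: all seqIds are -1, so the sort is a
        // stable no-op and the "newest" file is whatever happened to be last.
        assert newestAfterSort(files).equals("older");

        // Opening readers first assigns real seqIds and restores the order.
        older.openReader(5);
        newer.openReader(10);
        assert newestAfterSort(files).equals("newer");
        System.out.println("ok");
    }
}
```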





[jira] [Commented] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570404#comment-15570404
 ] 

Hadoop QA commented on HBASE-16721:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 2m 39s 
{color} | {color:red} root in branch-1.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_101. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 3m 45s 
{color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.7.0_80. 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
9s {color} | {color:green} branch-1.1 passed {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 14s 
{color} | {color:red} hbase-server in branch-1.1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in branch-1.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_101. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.7.0_80. 
{color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 5s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_101. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 5s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_101. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_80. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 6s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_80. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
7s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 55s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red} 0m 7s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 8s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 6s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_101. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 7s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_80. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 8s {color} | 
{color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
40s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 28s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:35e2245 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833008/hbase-16721_addendum2.branch-1.1.patch
 |
| JIRA Issue | HBASE-16721 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 8f73a74263de 3.13.0-92-

[jira] [Commented] (HBASE-16752) Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit

2016-10-12 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570394#comment-15570394
 ] 

Ashu Pachauri commented on HBASE-16752:
---

[~ghelmling] Yeah, you raise a valid point; we need to address the remaining 
part of the request on the channel in some way. Reading the garbage off the 
network just to discard the data sounds like overkill. Even now we close the 
connection; with this change we'll give better feedback to the client even if 
we do close the connection. So I'll just modify the code to close the 
connection.

> Upgrading from 1.2 to 1.3 can lead to replication failures due to difference 
> in RPC size limit
> --
>
> Key: HBASE-16752
> URL: https://issues.apache.org/jira/browse/HBASE-16752
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-16752.V1.patch
>
>
> In HBase 1.2, we don't limit the size of a single RPC, but in 1.3 we limit it 
> by default to 256 MB. This means that during upgrade scenarios (or when the 
> source is on 1.2 and the peer is already on 1.3), it's possible to encounter 
> a situation where we try to send an rpc larger than 256 MB, because we never 
> unroll a WALEdit while sending replication traffic.
> RpcServer throws the underlying exception locally but closes the connection 
> without returning the underlying error to the client, so the client only sees 
> a "Broken pipe" error.
> I am not sure what is the proper fix here (or if one is needed) to make sure 
> this does not happen, but we should return the underlying exception to the 
> RpcClient, because without it, it can be difficult to diagnose the problem, 
> especially for someone new to HBase.





[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache

2016-10-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570386#comment-15570386
 ] 

stack commented on HBASE-15560:
---

Thanks Ben. Let me try it here. Should be able in next day or so.

> TinyLFU-based BlockCache
> 
>
> Key: HBASE-15560
> URL: https://issues.apache.org/jira/browse/HBASE-15560
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Affects Versions: 2.0.0
>Reporter: Ben Manes
>Assignee: Ben Manes
> Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> tinylfu.patch
>
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and 
> recency of the working set. It achieves concurrency by using an O( n ) 
> background thread to prioritize the entries and evict. Accessing an entry is 
> O(1) by a hash table lookup, recording its logical access time, and setting a 
> frequency flag. A write is performed in O(1) time by updating the hash table 
> and triggering an async eviction thread. This provides ideal concurrency and 
> minimizes the latencies by penalizing the thread instead of the caller. 
> However the policy does not age the frequencies and may not be resilient to 
> various workload patterns.
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the 
> frequency in a counting sketch, ages periodically by halving the counters, 
> and orders entries by SLRU. An entry is discarded by comparing the frequency 
> of the new arrival (candidate) to the SLRU's victim, and keeping the one with 
> the highest frequency. This allows the operations to be performed in O(1) 
> time and, through the use of a compact sketch, a much larger history is 
> retained beyond the current working set. In a variety of real world traces 
> the policy had [near optimal hit 
> rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
> Concurrency is achieved by buffering and replaying the operations, similar to 
> a write-ahead log. A read is recorded into a striped ring buffer and writes 
> to a queue. The operations are applied in batches under a try-lock by an 
> asynchronous thread, thereby tracking the usage pattern without incurring high 
> latencies 
> ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit 
> rates) the two caches have near identical throughput and latencies with 
> LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a 
> 1-4% hit rate improvement and therefore lower latencies. The lackluster 
> result is because a synthetic Zipfian distribution is used, which SLRU 
> performs optimally. In a more varied, real-world workload we'd expect to see 
> improvements by being able to make smarter predictions.
> The provided patch implements BlockCache using the 
> [Caffeine|https://github.com/ben-manes/caffeine] caching library (see 
> HighScalability 
> [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for 
> evaluating this patch ([github 
> branch|https://github.com/ben-manes/hbase/tree/tinylfu]).
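The counting-sketch-with-aging mechanism described above can be illustrated in miniature. This is a toy, not Caffeine's actual implementation (which uses a 4-bit CountMin sketch); the class and method names are hypothetical.

```java
// Toy illustration of the TinyLFU idea: a small counting sketch that ages by
// halving every counter once enough increments accrue, plus the admission
// test that compares a candidate's frequency against the eviction victim's.
// NOT Caffeine's implementation; all names are hypothetical.
public class TinyLfuSketch {
    private final int[] counts;
    private final int resetThreshold;
    private int increments;

    TinyLfuSketch(int slots, int resetThreshold) {
        this.counts = new int[slots];
        this.resetThreshold = resetThreshold;
    }

    private int slot(Object key) {
        return Math.floorMod(key.hashCode(), counts.length);
    }

    void record(Object key) {
        counts[slot(key)]++;
        if (++increments >= resetThreshold) {
            // Aging: halve every counter so stale popularity decays.
            for (int i = 0; i < counts.length; i++) counts[i] >>>= 1;
            increments = 0;
        }
    }

    int frequency(Object key) {
        return counts[slot(key)];
    }

    // Admission: keep the candidate only if it is at least as frequent
    // as the victim chosen by the SLRU eviction policy.
    boolean admit(Object candidate, Object victim) {
        return frequency(candidate) >= frequency(victim);
    }

    public static void main(String[] args) {
        TinyLfuSketch sketch = new TinyLfuSketch(64, 1000);
        for (int i = 0; i < 5; i++) sketch.record("hot");
        sketch.record("cold");
        assert sketch.admit("hot", "cold");
        assert !sketch.admit("cold", "hot");
        System.out.println("ok");
    }
}
```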





[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570379#comment-15570379
 ] 

stack commented on HBASE-7912:
--

{code}
2016-10-12 17:19:49,538 DEBUG [AsyncRpcChannel-pool2-t13] ipc.AsyncRpcChannel: 
Use SIMPLE authentication for service ClientService, sasl=false
2016-10-12 17:19:49,554 ERROR [main] impl.FullTableBackupClient: Unexpected 
BackupException : Wrong FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, 
expected: file:///
java.lang.IllegalArgumentException: Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:647)
at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
at 
org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:425)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
at org.apache.hadoop.fs.FileSystem$4.(FileSystem.java:1712)
at 
org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1711)
at 
org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:589)
at org.apache.hadoop.fs.FileSystem$6.(FileSystem.java:1787)
at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:1783)
at 
org.apache.hadoop.hbase.backup.util.BackupClientUtil.getFiles(BackupClientUtil.java:161)
at 
org.apache.hadoop.hbase.backup.util.BackupServerUtil.getWALFilesOlderThan(BackupServerUtil.java:381)
at 
org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.execute(FullTableBackupClient.java:492)
at 
org.apache.hadoop.hbase.backup.impl.HBaseBackupAdmin.backupTables(HBaseBackupAdmin.java:532)
at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:225)
at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:114)
at 
org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:135)
at 
org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:171)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at 
org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:140)
2016-10-12 17:19:49,557 ERROR [main] impl.FullTableBackupClient: 
BackupId=backup_1476317988066,startts=1476317988637,failedts=1476317989557,failedphase=null,failedmessage=Wrong
 FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///
2016-10-12 17:19:49,559 DEBUG [AsyncRpcChannel-pool2-t14] ipc.AsyncRpcChannel: 
Use SIMPLE authentication for service ClientService, sasl=false
2016-10-12 17:19:49,656 DEBUG [main] ipc.AsyncRpcClient: Stopping async HBase 
RPC client
{code}

I pass the --config pointing at my conf dir which points at my hdfs and hbase. 
If I don't pass a --config, it'll use all defaults and not find the clusters 
(My config dir includes symlink to hdfs-site.xml).

Yeah, I've seen similar mismatch issues in the past in my own code (your google 
pointer is for a code writer, not for a 'user' like me). I can bang my head 
and try 'fixing' it, but I am trying to convey a user's experience following 
the instructions and tool usage. What is a little odd here is that the 
complaint comes out of the backup tool, not about the arg I'm passing (it must 
not be reading it immediately... because it doesn't matter whether I pass a 
file:/// or hdfs:/// scheme for the backup location).

Let me know what you want me to try... 

This is a straight cluster deploy on a Hadoop 2.7.3 build. Everything 
generally checks out.

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share with community through this jira. 
> We are leveraging existing hbase snapshot feature, and provide a general 
> solution to common users. Our full backup is using snapshot to capture 
> metadata locally and using exportsnapshot to move

[jira] [Updated] (HBASE-16813) Procedure v2 - Move ProcedureEvent to hbase-procedure module

2016-10-12 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-16813:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Procedure v2 - Move ProcedureEvent to hbase-procedure module
> 
>
> Key: HBASE-16813
> URL: https://issues.apache.org/jira/browse/HBASE-16813
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-16813-v0.patch
>
>
> ProcedureEvent was added in MasterProcedureScheduler, but it is generic 
> enough to move to hbase-procedure module.





[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache

2016-10-12 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570364#comment-15570364
 ] 

Ben Manes commented on HBASE-15560:
---

YCSB workload B states that each record is 1 KB, so that is about 100 MB 
(97 MiB). That probably introduces some misses due to Java object overhead. 
Since LruBlockCache uses a high watermark and evicts down to a low watermark, 
it could be aggressively under-utilizing the capacity. So a higher hit rate 
might be understandable, in addition to the workload pattern's characteristics.

> TinyLFU-based BlockCache
> 
>
> Key: HBASE-15560
> URL: https://issues.apache.org/jira/browse/HBASE-15560
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Affects Versions: 2.0.0
>Reporter: Ben Manes
>Assignee: Ben Manes
> Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> tinylfu.patch
>
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and 
> recency of the working set. It achieves concurrency by using an O( n ) 
> background thread to prioritize the entries and evict. Accessing an entry is 
> O(1) by a hash table lookup, recording its logical access time, and setting a 
> frequency flag. A write is performed in O(1) time by updating the hash table 
> and triggering an async eviction thread. This provides ideal concurrency and 
> minimizes the latencies by penalizing the thread instead of the caller. 
> However the policy does not age the frequencies and may not be resilient to 
> various workload patterns.
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the 
> frequency in a counting sketch, ages periodically by halving the counters, 
> and orders entries by SLRU. An entry is discarded by comparing the frequency 
> of the new arrival (candidate) to the SLRU's victim, and keeping the one with 
> the highest frequency. This allows the operations to be performed in O(1) 
> time and, through the use of a compact sketch, a much larger history is 
> retained beyond the current working set. In a variety of real world traces 
> the policy had [near optimal hit 
> rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
> Concurrency is achieved by buffering and replaying the operations, similar to 
> a write-ahead log. A read is recorded into a striped ring buffer and writes 
> to a queue. The operations are applied in batches under a try-lock by an 
> asynchronous thread, thereby tracking the usage pattern without incurring high 
> latencies 
> ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit 
> rates) the two caches have near identical throughput and latencies with 
> LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a 
> 1-4% hit rate improvement and therefore lower latencies. The lackluster 
> result is because a synthetic Zipfian distribution is used, which SLRU 
> performs optimally. In a more varied, real-world workload we'd expect to see 
> improvements by being able to make smarter predictions.
> The provided patch implements BlockCache using the 
> [Caffeine|https://github.com/ben-manes/caffeine] caching library (see 
> HighScalability 
> [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for 
> evaluating this patch ([github 
> branch|https://github.com/ben-manes/hbase/tree/tinylfu]).





[jira] [Commented] (HBASE-16812) Cleanup deprecated compact() function

2016-10-12 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570355#comment-15570355
 ] 

Appy commented on HBASE-16812:
--

So, looking around, we take locks from:
- mob compaction chore in master
- sweeper chore in master
- compaction from regions

If compaction tool triggers compact, then no locks are being taken.
(In fact, it's not even triggering mob compaction. [~jingcheng...@intel.com], 
can you please take on this jira and fix that.)

If ExpiredMobFileCleaner is triggered from command line, then no locks are 
taken.
Shouldn't we also take locks when these operations are triggered from tools?
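The locking discipline under discussion could look roughly like this: shared access for operations that may run concurrently (a cleaner, a normal compaction), exclusive access for an operation that rewrites mob files. A hedged sketch with hypothetical names, not the actual HBase lock code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Sketch of read/write locking between tool-triggered operations and mob
// compaction. Hypothetical shape; not the real HBase implementation.
public class MobLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Shared: several operations may hold this concurrently (e.g. a cleaner
    // run synchronized against, but not excluding, other readers).
    <T> T withSharedLock(Supplier<T> op) {
        lock.readLock().lock();
        try {
            return op.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Exclusive: blocks until all shared holders release (e.g. a major mob
    // compaction rewriting files).
    <T> T withExclusiveLock(Supplier<T> op) {
        lock.writeLock().lock();
        try {
            return op.get();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        MobLockSketch sketch = new MobLockSketch();
        int a = sketch.withSharedLock(() -> 1);
        int b = sketch.withExclusiveLock(() -> 2);
        assert a == 1 && b == 2;
        System.out.println("ok");
    }
}
```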


> Cleanup deprecated compact() function
> -
>
> Key: HBASE-16812
> URL: https://issues.apache.org/jira/browse/HBASE-16812
> Project: HBase
>  Issue Type: Task
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
> Attachments: HBASE-16812.master.001.patch
>
>
> compact(CompactionContext compaction, CompactionThroughputController 
> throughputController) is [deprecated in 1.2.0 
> release|https://github.com/apache/hbase/blob/rel/1.2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java#L222].
> Store.java is also marked limited private.
> Context: I was cleaning up zk table lock which is also used in that method's 
> [override|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L460]
>  in HMobStore.
> This method isn't being called from anywhere except CompactionTool (which 
> creates HStore object, not HMobStore object).
> [~jingcheng...@intel.com] Can you PTAL and help me understand what's going on.





[jira] [Commented] (HBASE-16752) Upgrading from 1.2 to 1.3 can lead to replication failures due to difference in RPC size limit

2016-10-12 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570353#comment-15570353
 ] 

Gary Helmling commented on HBASE-16752:
---

Please add javadoc for RequestTooBigException.

It occurs to me that we also need to do something with the remaining request 
data (the body) that is sitting in the socket channel.  Either we'll need to 
read it off into something like a garbage buffer and throw it away, or else 
maybe we should actually close the connection.  If we do close the connection, 
then does this change still buy us something?
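The trade-off under discussion can be sketched roughly as follows: after reading only the length prefix of an oversized request, the server cannot dispatch it, and the unread body is still on the wire, so the simplest safe move is to send an error response and then close the connection. This is hypothetical framing code, not the real RpcServer wire handling.

```java
import java.nio.ByteBuffer;

// Sketch of the oversized-request check. Hypothetical names; the 256 MB
// figure matches the 1.3 default mentioned in the issue description.
public class TooBigRequestSketch {
    static final int MAX_REQUEST_SIZE = 256 * 1024 * 1024;

    // Returns the error message to send before closing the connection, or
    // null if the request may be read and dispatched normally.
    static String checkLengthPrefix(ByteBuffer header) {
        int dataLength = header.getInt(0); // 4-byte length prefix
        if (dataLength > MAX_REQUEST_SIZE) {
            return "RPC data length of " + dataLength
                + " is greater than max allowed " + MAX_REQUEST_SIZE
                + "; responding, then closing connection";
        }
        return null;
    }

    public static void main(String[] args) {
        ByteBuffer tooBig = ByteBuffer.allocate(4).putInt(0, 300 * 1024 * 1024);
        ByteBuffer ok = ByteBuffer.allocate(4).putInt(0, 1024);
        assert checkLengthPrefix(tooBig) != null;
        assert checkLengthPrefix(ok) == null;
        System.out.println("ok");
    }
}
```

Draining the remaining body into a scratch buffer would keep the connection alive, but for a body in the hundreds of megabytes that only delays the inevitable; closing after responding at least surfaces the real error to the client.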

> Upgrading from 1.2 to 1.3 can lead to replication failures due to difference 
> in RPC size limit
> --
>
> Key: HBASE-16752
> URL: https://issues.apache.org/jira/browse/HBASE-16752
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-16752.V1.patch
>
>
> In HBase 1.2, we don't limit size of a single RPC but in 1.3 we limit it by 
> default to 256 MB.  This means that during upgrade scenarios (or when source 
> is 1.2 peer is already on 1.3), it's possible to encounter a situation where 
> we try to send an rpc with size greater than 256 MB because we never unroll a 
> WALEdit while sending replication traffic.
> RpcServer throws the underlying exception locally, but closes the connection 
> without returning the underlying error to the client, so the client only sees 
> a "Broken pipe" error.
> I am not sure what the proper fix is here (or if one is needed) to make sure 
> this does not happen, but we should return the underlying exception to the 
> RpcClient; without it, the problem can be difficult to diagnose, especially 
> for someone new to HBase.





[jira] [Updated] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking

2016-10-12 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16721:
--
Status: Patch Available  (was: Reopened)

> Concurrency issue in WAL unflushed seqId tracking
> -
>
> Key: HBASE-16721
> URL: https://issues.apache.org/jira/browse/HBASE-16721
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.2.0, 1.1.0, 1.0.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8
>
> Attachments: hbase-16721_addendum.patch, 
> hbase-16721_addendum2.branch-1.1.patch, hbase-16721_v1.branch-1.patch, 
> hbase-16721_v2.branch-1.patch, hbase-16721_v2.master.patch
>
>
> I'm inspecting an interesting case where, in a production cluster, some 
> regionservers end up accumulating hundreds of WAL files, even with force 
> flushes going on due to max logs. This happened multiple times on the 
> cluster, but not on other clusters. The cluster has the periodic memstore 
> flusher disabled; however, this still does not explain why the force flush of 
> regions due to the max limit is not working. I think the periodic memstore 
> flusher just masks the underlying problem, which is why we do not see this in 
> other clusters. 
> The problem starts like this: 
> {code}
> 2016-09-21 17:49:18,272 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=33, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-21 17:49:18,273 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> then, it continues until the RS is restarted: 
> {code}
> 2016-09-23 17:43:49,356 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=721, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-23 17:43:49,357 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> The problem is that region {{d4cf39dc40ea79f5da4d0cf66d03cb1f}} was already 
> split some time ago, and was able to flush its data and split without any 
> problems. However, the FSHLog still thinks that there is some unflushed data 
> for this region. 





[jira] [Updated] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking

2016-10-12 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16721:
--
Attachment: hbase-16721_addendum2.branch-1.1.patch

This addendum is needed for 1.1 without a proper fix for HBASE-16820. It is 
basically undoing the HRegion changes, but adding a 
{{waitForPreviousTransactionsComplete()}} call at the beginning instead. We end 
up doing two mvcc transactions, but at the end of the flush, we are guaranteed 
that the mvcc read point has advanced to the flush sequence id. 
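
The ordering guarantee described above can be illustrated with a simplified MVCC tracker (a hypothetical sketch only, not HBase's actual MultiVersionConcurrencyControl):

```java
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy MVCC: the read point only advances past fully completed transactions. */
class SimpleMvcc {
  private long readPoint = 0;
  private long writePoint = 0;
  private final SortedSet<Long> inFlight = new TreeSet<>();

  /** Starts a transaction and returns its write number. */
  synchronized long begin() {
    long w = ++writePoint;
    inFlight.add(w);
    return w;
  }

  /** Completes a transaction; advances the read point if no earlier one is pending. */
  synchronized void complete(long w) {
    inFlight.remove(w);
    long candidate = inFlight.isEmpty() ? writePoint : inFlight.first() - 1;
    if (candidate > readPoint) {
      readPoint = candidate;
      notifyAll();
    }
  }

  /** Blocks until every transaction started so far has completed. */
  synchronized void waitForPreviousTransactionsComplete() throws InterruptedException {
    long target = writePoint;
    while (readPoint < target) {
      wait();
    }
  }

  synchronized long getReadPoint() {
    return readPoint;
  }
}
```

Calling the wait method at the start of a flush, as the addendum does, ensures no earlier in-flight transaction can still be holding back the read point when the flush sequence id is assigned.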

> Concurrency issue in WAL unflushed seqId tracking
> -
>
> Key: HBASE-16721
> URL: https://issues.apache.org/jira/browse/HBASE-16721
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.0, 1.1.0, 1.2.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8
>
> Attachments: hbase-16721_addendum.patch, 
> hbase-16721_addendum2.branch-1.1.patch, hbase-16721_v1.branch-1.patch, 
> hbase-16721_v2.branch-1.patch, hbase-16721_v2.master.patch
>
>
> I'm inspecting an interesting case where, in a production cluster, some 
> regionservers end up accumulating hundreds of WAL files, even with force 
> flushes going on due to max logs. This happened multiple times on the 
> cluster, but not on other clusters. The cluster has the periodic memstore 
> flusher disabled; however, this still does not explain why the force flush of 
> regions due to the max limit is not working. I think the periodic memstore 
> flusher just masks the underlying problem, which is why we do not see this in 
> other clusters. 
> The problem starts like this: 
> {code}
> 2016-09-21 17:49:18,272 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=33, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-21 17:49:18,273 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> then, it continues until the RS is restarted: 
> {code}
> 2016-09-23 17:43:49,356 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=721, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-23 17:43:49,357 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> The problem is that region {{d4cf39dc40ea79f5da4d0cf66d03cb1f}} was already 
> split some time ago, and was able to flush its data and split without any 
> problems. However, the FSHLog still thinks that there is some unflushed data 
> for this region. 





[jira] [Commented] (HBASE-16641) QA tests for hbase-client skip the second part.

2016-10-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570333#comment-15570333
 ] 

stack commented on HBASE-16641:
---

[~chenheng] Missed this. Does HBASE-16785 fix it?

> QA tests for hbase-client skip the second part.
> ---
>
> Key: HBASE-16641
> URL: https://issues.apache.org/jira/browse/HBASE-16641
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/3547/artifact/patchprocess/patch-unit-hbase-client.txt
> {code}
> [INFO] --- maven-surefire-plugin:2.18.1:test (secondPartTestsExecution) @ 
> hbase-client ---
> [INFO] Tests are skipped.
> {code}
> The first part passed fine, but the second part is skipped. 
> Notice hbase-client/pom.xml 
> {code}
> <execution>
>   <id>secondPartTestsExecution</id>
>   <phase>test</phase>
>   <goals>
>     <goal>test</goal>
>   </goals>
>   <configuration>
>     <skip>true</skip>
>   </configuration>
> </execution>
> {code}
> If I change 'skip' to false, the second part is triggered. But this 
> configuration has existed for a long time; was the cmd line on the build box 
> updated recently? 





[jira] [Reopened] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking

2016-10-12 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reopened HBASE-16721:
---

Reopening, since we have to do a quick addendum for 1.1. See HBASE-16820. 

> Concurrency issue in WAL unflushed seqId tracking
> -
>
> Key: HBASE-16721
> URL: https://issues.apache.org/jira/browse/HBASE-16721
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.0, 1.1.0, 1.2.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8
>
> Attachments: hbase-16721_addendum.patch, 
> hbase-16721_v1.branch-1.patch, hbase-16721_v2.branch-1.patch, 
> hbase-16721_v2.master.patch
>
>
> I'm inspecting an interesting case where, in a production cluster, some 
> regionservers end up accumulating hundreds of WAL files, even with force 
> flushes going on due to max logs. This happened multiple times on the 
> cluster, but not on other clusters. The cluster has the periodic memstore 
> flusher disabled; however, this still does not explain why the force flush of 
> regions due to the max limit is not working. I think the periodic memstore 
> flusher just masks the underlying problem, which is why we do not see this in 
> other clusters. 
> The problem starts like this: 
> {code}
> 2016-09-21 17:49:18,272 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=33, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-21 17:49:18,273 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> then, it continues until the RS is restarted: 
> {code}
> 2016-09-23 17:43:49,356 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=721, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-23 17:43:49,357 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> The problem is that region {{d4cf39dc40ea79f5da4d0cf66d03cb1f}} was already 
> split some time ago, and was able to flush its data and split without any 
> problems. However, the FSHLog still thinks that there is some unflushed data 
> for this region. 





[jira] [Commented] (HBASE-16749) HBase root pom.xml contains repo from people.apache.org/~garyh

2016-10-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570331#comment-15570331
 ] 

Hudson commented on HBASE-16749:


FAILURE: Integrated in Jenkins build HBase-1.1-JDK7 #1795 (See 
[https://builds.apache.org/job/HBase-1.1-JDK7/1795/])
HBASE-16749 Remove defunct repo from people.apache.org (garyh: rev 
edac95201492a135e9908d94597741b44ab6496f)
* (edit) pom.xml


> HBase root pom.xml contains repo from people.apache.org/~garyh
> --
>
> Key: HBASE-16749
> URL: https://issues.apache.org/jira/browse/HBASE-16749
> Project: HBase
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Gary Helmling
> Fix For: 1.2.5, 1.1.8
>
> Attachments: HBASE-16749.branch-1.1.patch, 
> HBASE-16749.branch-1.2.patch, HBASE-16749.branch-1.2.patch
>
>
> I found this while building Hadoop/YARN:
> {code}
> [INFO] 
> 
> [INFO] Building Apache Hadoop YARN Timeline Service 3.0.0-alpha2-SNAPSHOT
> [INFO] 
> 
> Downloading: 
> http://conjars.org/repo/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> ...
> Downloading: 
> http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-client/3.0.0-alpha2-SNAPSHOT/maven-metadata.xml
> {code}
> [~te...@apache.org] mentioned:
> bq. Among hbase releases (1.1.x, 1.2.y), root pom.xml contains 
> "ghelmling.testing" repo. 
> This private repo sometimes extends the Hadoop build time a lot. It might be 
> better to use the ASF snapshot repo first before downloading from the private 
> repo.
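
The suggested ordering could look like the following pom.xml fragment (illustrative only: Maven generally consults repositories in declaration order, and the ids/URLs here are taken from the build log above, not from the actual patch):

{code}
<repositories>
  <!-- Try the ASF snapshot repo first... -->
  <repository>
    <id>apache.snapshots</id>
    <url>https://repository.apache.org/snapshots/</url>
  </repository>
  <!-- ...and only fall back to the private repo if needed. -->
  <repository>
    <id>ghelmling.testing</id>
    <url>http://people.apache.org/~garyh/mvn/</url>
  </repository>
</repositories>
{code}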




