[jira] [Updated] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-18122:
--
Fix Version/s: 1.1.11
   1.3.2
   1.2.6
   1.4.0
   2.0.0

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 1.1.10
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.1.11
>
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.
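
A minimal sketch of the timestamp part of this encoding, assuming the high 32
bits of the 64-bit scanner id carry the region server's start time and the low
32 bits carry a per-server counter; the bit split and class name are
illustrative only (the actual change also brings ServerName/host:port into the
encoding):

{code}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: pack the RS start time into the high bits of the scanner id.
public class ScannerIdSketch {
  // Assumed split: 32 high bits from the server start timestamp, 32 low bits for a counter.
  private final long serverStartBits;
  private final AtomicLong counter = new AtomicLong(0);

  public ScannerIdSketch(long serverStartTimeMillis) {
    // Keep 32 bits of the start time so a restarted server produces a different
    // prefix and its ids never collide with ids handed out before the restart.
    this.serverStartBits = (serverStartTimeMillis & 0xFFFFFFFFL) << 32;
  }

  public long nextScannerId() {
    long seq = counter.incrementAndGet() & 0xFFFFFFFFL; // wraps within the low 32 bits
    return serverStartBits | seq;
  }
}
{code}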



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-18122:
--
Affects Version/s: 1.4.0
   2.0.0
   1.3.1
   1.2.5
   1.1.10

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 1.1.10
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.1.11
>
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18130) Refactor ReplicationSource

2017-05-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-18130:
---
Attachment: HBASE-18130.master.005.patch

> Refactor ReplicationSource
> --
>
> Key: HBASE-18130
> URL: https://issues.apache.org/jira/browse/HBASE-18130
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-18130.master.001.patch, 
> HBASE-18130.master.002.patch, HBASE-18130.master.003.patch, 
> HBASE-18130.master.004.patch, HBASE-18130.master.004.patch, 
> HBASE-18130.master.005.patch
>
>
> One basic idea is to move the recovered-queue code into a new subclass, 
> RecoveredReplicationSource. Then ReplicationSource will no longer need to 
> call isQueueRecovered in many places, which makes the code clearer.
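
A minimal sketch of the intended shape, with hypothetical method names, only to
illustrate pulling the recovered-queue behavior into a subclass instead of
branching on isQueueRecovered:

{code}
// Illustrative only; method names are hypothetical, not the actual patch.
class ReplicationSource {
  // Common reading/shipping logic lives here and calls the hook below.
  protected void onQueueDrained() {
    // A normal source just keeps waiting for new WALs to appear.
  }
}

class RecoveredReplicationSource extends ReplicationSource {
  // Recovered-queue specifics are overridden here, so the base class never
  // needs to ask isQueueRecovered().
  @Override
  protected void onQueueDrained() {
    // A recovered queue is finite: once drained, remove the queue and stop the source.
  }
}
{code}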



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)

2017-05-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030667#comment-16030667
 ] 

stack commented on HBASE-16148:
---

We failed early, [~amit.patel]. Let's chat...

> Hybrid Logical Clocks(placeholder for running tests)
> 
>
> Key: HBASE-16148
> URL: https://issues.apache.org/jira/browse/HBASE-16148
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sai Teja Ranuva
>Assignee: Sai Teja Ranuva
>Priority: Minor
>  Labels: test-patch
> Attachments: HBASE-16148.master.001.patch, 
> HBASE-16148.master.002.patch, HBASE-16148.master.003.patch, 
> HBASE-16148.master.004.patch, HBASE-16148.master.005.patch, 
> HBASE-16148.master.6.patch, HBASE-16148.master.test.1.patch, 
> HBASE-16148.master.test.2.patch, HBASE-16148.master.test.3.patch, 
> HBASE-16148.master.test.4.patch, HBASE-16148.master.test.5.patch, 
> HLC.10.1.patch, HLC.10.2.patch, HLC.10.3.patch, HLC.10.4.patch, 
> HLC.10.5.patch, HLC.10.6.patch, HLC.10.7.patch, HLC.10.patch, HLC.1.patch, 
> HLC.2.patch, HLC.3.patch, HLC.4.patch, HLC.5.patch, HLC.6.patch, HLC.8.patch, 
> HLC.9.patch, HLC.patch
>
>
> This JIRA is just a placeholder to test Hybrid Logical Clocks code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14614:
--
Attachment: HBASE-14614.master.049.patch

> Procedure v2: Core Assignment Manager
> -
>
> Key: HBASE-14614
> URL: https://issues.apache.org/jira/browse/HBASE-14614
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-14614.master.003.patch, 
> HBASE-14614.master.004.patch, HBASE-14614.master.005.patch, 
> HBASE-14614.master.006.patch, HBASE-14614.master.007.patch, 
> HBASE-14614.master.008.patch, HBASE-14614.master.009.patch, 
> HBASE-14614.master.010.patch, HBASE-14614.master.012.patch, 
> HBASE-14614.master.013.patch, HBASE-14614.master.014.patch, 
> HBASE-14614.master.015.patch, HBASE-14614.master.017.patch, 
> HBASE-14614.master.018.patch, HBASE-14614.master.019.patch, 
> HBASE-14614.master.020.patch, HBASE-14614.master.022.patch, 
> HBASE-14614.master.023.patch, HBASE-14614.master.024.patch, 
> HBASE-14614.master.025.patch, HBASE-14614.master.026.patch, 
> HBASE-14614.master.027.patch, HBASE-14614.master.028.patch, 
> HBASE-14614.master.029.patch, HBASE-14614.master.030.patch, 
> HBASE-14614.master.033.patch, HBASE-14614.master.038.patch, 
> HBASE-14614.master.039.patch, HBASE-14614.master.040.patch, 
> HBASE-14614.master.041.patch, HBASE-14614.master.042.patch, 
> HBASE-14614.master.043.patch, HBASE-14614.master.044.patch, 
> HBASE-14614.master.045.patch, HBASE-14614.master.045.patch, 
> HBASE-14614.master.046.patch, HBASE-14614.master.047.patch, 
> HBASE-14614.master.048.patch, HBASE-14614.master.049.patch
>
>
> New AssignmentManager implemented using proc-v2.
>  - AssignProcedure handles assignment operations
>  - UnassignProcedure handles unassign operations
>  - MoveRegionProcedure handles move/balance operations
> Concurrent Assign operations are batched together and sent to the balancer.
> Concurrent Assign and Unassign operations that are ready to be sent to the RS 
> are batched together.
> This patch is an intermediate state where we add the new AM as 
> AssignmentManager2() to the master so it can be reached by tests, but the new 
> AM will not be integrated with the rest of the system. Only the new AM unit 
> tests will exercise the new assignment manager. The integration with the 
> master code is part of HBASE-14616.
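
A very rough sketch of the class shape this describes; the simplified base
class and execute() signature below are illustrative stand-ins, not the real
proc-v2 framework API:

{code}
// Illustrative only; the real procedures extend the proc-v2 framework classes.
abstract class RegionTransitProcedureSketch {
  final String regionEncodedName;
  RegionTransitProcedureSketch(String regionEncodedName) {
    this.regionEncodedName = regionEncodedName;
  }
  abstract void execute();
}

class AssignProcedureSketch extends RegionTransitProcedureSketch {
  AssignProcedureSketch(String region) { super(region); }
  @Override void execute() { /* pick a target RS via the balancer, send OPEN */ }
}

class UnassignProcedureSketch extends RegionTransitProcedureSketch {
  UnassignProcedureSketch(String region) { super(region); }
  @Override void execute() { /* send CLOSE to the hosting RS */ }
}

class MoveRegionProcedureSketch extends RegionTransitProcedureSketch {
  MoveRegionProcedureSketch(String region) { super(region); }
  @Override void execute() { /* unassign from the source RS, then assign to the target */ }
}
{code}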



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030661#comment-16030661
 ] 

stack commented on HBASE-14614:
---

Went over the failed unit tests; they all pass locally except for 
TestExportSnapshot, which I disabled for now to look at later. Uploading a new 
patch.

> Procedure v2: Core Assignment Manager
> -
>
> Key: HBASE-14614
> URL: https://issues.apache.org/jira/browse/HBASE-14614
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-14614.master.003.patch, 
> HBASE-14614.master.004.patch, HBASE-14614.master.005.patch, 
> HBASE-14614.master.006.patch, HBASE-14614.master.007.patch, 
> HBASE-14614.master.008.patch, HBASE-14614.master.009.patch, 
> HBASE-14614.master.010.patch, HBASE-14614.master.012.patch, 
> HBASE-14614.master.013.patch, HBASE-14614.master.014.patch, 
> HBASE-14614.master.015.patch, HBASE-14614.master.017.patch, 
> HBASE-14614.master.018.patch, HBASE-14614.master.019.patch, 
> HBASE-14614.master.020.patch, HBASE-14614.master.022.patch, 
> HBASE-14614.master.023.patch, HBASE-14614.master.024.patch, 
> HBASE-14614.master.025.patch, HBASE-14614.master.026.patch, 
> HBASE-14614.master.027.patch, HBASE-14614.master.028.patch, 
> HBASE-14614.master.029.patch, HBASE-14614.master.030.patch, 
> HBASE-14614.master.033.patch, HBASE-14614.master.038.patch, 
> HBASE-14614.master.039.patch, HBASE-14614.master.040.patch, 
> HBASE-14614.master.041.patch, HBASE-14614.master.042.patch, 
> HBASE-14614.master.043.patch, HBASE-14614.master.044.patch, 
> HBASE-14614.master.045.patch, HBASE-14614.master.045.patch, 
> HBASE-14614.master.046.patch, HBASE-14614.master.047.patch, 
> HBASE-14614.master.048.patch
>
>
> New AssignmentManager implemented using proc-v2.
>  - AssignProcedure handles assignment operations
>  - UnassignProcedure handles unassign operations
>  - MoveRegionProcedure handles move/balance operations
> Concurrent Assign operations are batched together and sent to the balancer.
> Concurrent Assign and Unassign operations that are ready to be sent to the RS 
> are batched together.
> This patch is an intermediate state where we add the new AM as 
> AssignmentManager2() to the master so it can be reached by tests, but the new 
> AM will not be integrated with the rest of the system. Only the new AM unit 
> tests will exercise the new assignment manager. The integration with the 
> master code is part of HBASE-14616.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-18129:
--
Attachment: HBASE-18129-master-v1.patch

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-branch-1-v3.patch, HBASE-18129-master.patch, 
> HBASE-18129-master-v1.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.
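
A hedged sketch of the kind of client-side fallback involved, written against
the 1.x Admin API from memory (exact signatures may differ across releases);
the idea is simply to fall back to disable/delete/recreate with the old split
keys when the master does not support truncate:

{code}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

// Sketch only: approximates what truncate_preserve has to do when the master
// does not implement truncateTable yet (e.g. during a rolling upgrade from 0.98).
public class TruncatePreserveSketch {
  static void truncatePreserve(Admin admin, TableName table) throws IOException {
    HTableDescriptor desc = admin.getTableDescriptor(table);
    List<HRegionInfo> regions = admin.getTableRegions(table);
    // Collect the old region boundaries (skip the first region's empty start key).
    byte[][] splits = regions.size() > 1 ? new byte[regions.size() - 1][] : null;
    for (int i = 1; i < regions.size(); i++) {
      splits[i - 1] = regions.get(i).getStartKey();
    }
    admin.disableTable(table);
    try {
      admin.truncateTable(table, true);   // newer masters: server-side truncate
    } catch (IOException e) {
      // Older master without truncate support: delete and recreate with the boundaries.
      admin.deleteTable(table);
      if (splits != null) {
        admin.createTable(desc, splits);
      } else {
        admin.createTable(desc);
      }
    }
  }
}
{code}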



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030658#comment-16030658
 ] 

Hadoop QA commented on HBASE-18122:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 27s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
35s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
42s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
34s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
24m 12s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 55s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 58s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.ipc.TestSimpleRpcScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870504/HBASE-18122.v04.patch 
|
| JIRA Issue | HBASE-18122 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 05f2529a332f 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 
24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / d547feac |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7013/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/7013/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7013/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7013/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-18027) Replication should respect RPC size limits when batching edits

2017-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030657#comment-16030657
 ] 

Hudson commented on HBASE-18027:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #3107 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/3107/])
HBASE-18027 HBaseInterClusterReplicationEndpoint should respect RPC (apurtell: 
rev d547feac6b673d59703e1a0ef46db38b26046e4c)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicator.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> Replication should respect RPC size limits when batching edits
> --
>
> Key: HBASE-18027
> URL: https://issues.apache.org/jira/browse/HBASE-18027
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, 
> HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
>
>
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in 
> batches. We create N lists. N is the minimum of configured replicator 
> threads, number of 100-waledit batches, or number of current sinks. Every 
> pending entry in the replication context is then placed in order by hash of 
> encoded region name into one of these N lists. Each of the N lists is then 
> sent all at once in one replication RPC. We do not check whether the total 
> data in each of the N lists will exceed RPC size limits. This code presumes 
> each individual 
> edit is reasonably small. Not checking for aggregate size while assembling 
> the lists into RPCs is an oversight and can lead to replication failure when 
> that assumption is violated.
> We can fix this by generating as many replication RPC calls as we need to 
> drain a list, keeping each RPC under limit, instead of assuming the whole 
> list will fit in one.
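
A small sketch of the core idea of the proposed fix, splitting one sink list
into however many sub-batches are needed so that each RPC stays under a byte
budget; the names and the size accessor are illustrative, not the actual patch:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Illustrative only: drain one list of edits into size-bounded RPC batches.
public class RpcBatcherSketch {
  static <E> List<List<E>> splitBySize(List<E> entries, long maxBytesPerRpc,
      ToLongFunction<E> sizeOf) {
    List<List<E>> batches = new ArrayList<>();
    List<E> current = new ArrayList<>();
    long currentBytes = 0;
    for (E entry : entries) {
      long bytes = sizeOf.applyAsLong(entry);
      // Start a new batch if adding this entry would exceed the RPC budget
      // (a single oversized entry still goes alone in its own batch).
      if (!current.isEmpty() && currentBytes + bytes > maxBytesPerRpc) {
        batches.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(entry);
      currentBytes += bytes;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }
}
{code}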



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030650#comment-16030650
 ] 

Hadoop QA commented on HBASE-18129:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 34s 
{color} | {color:red} Docker failed to build yetus/hbase:58c504e. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870512/HBASE-18129-branch-1-v3.patch
 |
| JIRA Issue | HBASE-18129 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7015/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-branch-1-v3.patch, HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-18129:
--
Attachment: HBASE-18129-branch-1-v3.patch

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-branch-1-v3.patch, HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15576) Scanning cursor to prevent blocking long time on ResultScanner.next()

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-15576:
--
Attachment: HBASE-15576.v04.patch

Retry

> Scanning cursor to prevent blocking long time on ResultScanner.next()
> -
>
> Key: HBASE-15576
> URL: https://issues.apache.org/jira/browse/HBASE-15576
> Project: HBase
>  Issue Type: New Feature
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-15576.v01.patch, HBASE-15576.v02.patch, 
> HBASE-15576.v03.patch, HBASE-15576.v03.patch, HBASE-15576.v04.patch, 
> HBASE-15576.v04.patch
>
>
> Since 1.1.0 we have the partial-result and heartbeat protocols in scanning to 
> avoid returning too much data in one response or hitting timeouts. Even so, 
> ResultScanner.next() may block for longer than the configured timeout before 
> it can return a Result if a row is very large, the filter is sparse, or there 
> are many delete markers in the files.
> However, in some scenarios we don't want next() to block for long. For 
> example, consider a web service that handles requests from mobile devices 
> whose network is unstable, so the timeout between the mobile client and the 
> web service must be short (e.g. only 5 seconds). The service scans rows from 
> HBase and returns them to the mobile devices. In this scenario the simplest 
> design is to make the web service stateless: the mobile app sends several 
> requests one by one until it has enough data, just like paging through a 
> list, and each request carries a start position derived from the last result 
> returned by the web service. Because the service is stateless, different 
> requests can go to different web service servers.
> Therefore, the stateless web service needs a cursor from HBase telling it how 
> far the RegionScanner has scanned when the HBase client receives an empty 
> heartbeat. The service returns the cursor to the mobile device even though 
> the response carries no data, and the next request can start at the cursor 
> position. Without the cursor we would have to rescan from the last returned 
> result and might time out forever. Even when the heartbeat message is not 
> empty, the cursor can still be used to avoid rescanning rows/cells that have 
> already been skipped.
> Obviously we give up scan consistency because the HBase client is also 
> stateless, but that is acceptable in this scenario. And maybe we can keep the 
> mvcc in the cursor so we can get a consistent view?
> HBASE-13099 had some discussion, but it has made no further progress so far.
> API:
> In Scan we need a new method setNeedCursorResult(true) to get the cursor row 
> key when there is an RPC response but the client cannot return any Result. In 
> this mode ResultScanner.next() will not block longer than the timeout 
> setting.
> {code}
> Result r;
> while ((r = scanner.next()) != null) {
>   if (r.isCursor()) {
>     // Scanning is not finished; this is a cursor. Save its row key and close
>     // the scanner if you want, or just continue the loop and call next().
>   } else {
>     // A normal Result, handled just like before.
>   }
> }
> // scanning has ended
> {code}
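
For completeness, a minimal sketch of how a stateless service could use this,
assuming the proposed Scan#setNeedCursorResult lands as described above and
that a Result#isCursor()/getCursor() accessor pair exists (the getCursor name
is an assumption, not part of the description):

{code}
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class CursorPagingSketch {
  /**
   * Sketch only: fetch one "page position" for a stateless web service. Returns
   * either the next row's key (normal Result) or the cursor row key (empty
   * heartbeat), so the caller can resume from it in the next request.
   */
  static byte[] nextPosition(Table table, byte[] startRow) throws java.io.IOException {
    Scan scan = new Scan();
    scan.withStartRow(startRow);        // position carried by the client request
    scan.setNeedCursorResult(true);     // proposed API: return cursor Results
    try (ResultScanner scanner = table.getScanner(scan)) {
      Result r = scanner.next();
      if (r == null) {
        return null;                    // scan finished
      }
      if (r.isCursor()) {
        return r.getCursor().getRow();  // where the RegionScanner has reached
      }
      return r.getRow();                // a normal Result, handled as before
    }
  }
}
{code}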



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030634#comment-16030634
 ] 

Chia-Ping Tsai commented on HBASE-18122:


LGTM

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18095) Provide an option for clients to find the server hosting META that does not involve the ZooKeeper client

2017-05-30 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030628#comment-16030628
 ] 

Lars Hofhansl commented on HBASE-18095:
---

I do like dropping the ZK dependencies for regular clients and making ZK more 
of an HBase-internal communication service.


> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> 
>
> Key: HBASE-18095
> URL: https://issues.apache.org/jira/browse/HBASE-18095
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Andrew Purtell
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers and the client uses an embedded ZK client to 
> query meta location. Timeouts and retry behavior of this embedded ZK client 
> are managed orthogonally to HBase layer settings and in some cases the ZK 
> cannot manage what in theory the HBase client can, i.e. fail fast upon outage 
> or network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in either 
> active or passive state will track meta location and respond to requests for 
> it with its cached last known location. If this location is stale, the client 
> can ask again with a flag set that requests the master refresh its location 
> cache and return the up-to-date location. Every client interaction with the 
> cluster thus uses only HBase RPC as transport, with appropriate settings 
> applied to the connection. The configuration toggle that enables this 
> alternative meta location lookup should be false by default.
> This removes the requirement that HBase clients embed the ZK client and 
> contact the ZK service directly at the beginning of the connection lifecycle. 
> This has several benefits. ZK service need not be exposed to clients, and 
> their potential abuse, yet no benefit ZK provides the HBase server cluster is 
> compromised. Normalizing HBase client and ZK client timeout settings and 
> retry behavior - in some cases, impossible, i.e. for fail-fast - is no longer 
> necessary. 
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from Zookeeper to 
> select the token identifier to use. So there would also need to be some 
> Zookeeper-less, unauthenticated way to obtain the cluster ID as well. 
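
A rough sketch of the client-side lookup flow this proposal describes; every
name here (the hbase.client.master.addrs config key, the MasterMetaLocator
interface, and fetchMetaLocation) is hypothetical, chosen only to illustrate
the flow, not actual HBase API:

{code}
import java.util.List;

// Hypothetical client-side meta lookup against a configured master list.
interface MasterMetaLocator {
  /** Ask one master for its cached meta location; refresh forces a re-read. */
  String fetchMetaLocation(String masterHostPort, boolean refresh) throws Exception;
}

class MetaLookupSketch {
  private final List<String> masterAddrs;   // e.g. from a key like hbase.client.master.addrs
  private final MasterMetaLocator locator;

  MetaLookupSketch(List<String> masterAddrs, MasterMetaLocator locator) {
    this.masterAddrs = masterAddrs;
    this.locator = locator;
  }

  /** Try every configured master; on a stale hit the caller retries with refresh=true. */
  String locateMeta(boolean refresh) {
    for (String master : masterAddrs) {
      try {
        return locator.fetchMetaLocation(master, refresh);   // HBase RPC, not ZooKeeper
      } catch (Exception e) {
        // Fall through and try the next master (active or backup).
      }
    }
    throw new IllegalStateException("No configured master could provide the meta location");
  }
}
{code}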



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030620#comment-16030620
 ] 

Ted Yu commented on HBASE-18122:


lgtm

Do the timed out tests pass for you ?

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15302) Reenable the other tests disabled by HBASE-14678

2017-05-30 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030600#comment-16030600
 ] 

Phil Yang commented on HBASE-15302:
---

Agreed, let's close this issue and file a new one if needed (not sure which 
tests are still not re-enabled; that needs checking).

> Reenable the other tests disabled by HBASE-14678
> 
>
> Key: HBASE-15302
> URL: https://issues.apache.org/jira/browse/HBASE-15302
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.2.7
>
> Attachments: HBASE-15302-branch-1.3-append-v1.patch, 
> HBASE-15302-branch-1.3-append-v1.patch, HBASE-15302-branch-1-append-v1.patch, 
> HBASE-15302-branch-1-v1.patch, HBASE-15302-v1.txt, HBASE-15302-v1.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-18138) HBase named read caches

2017-05-30 Thread Biju Nair (JIRA)
Biju Nair created HBASE-18138:
-

 Summary: HBase named read caches
 Key: HBASE-18138
 URL: https://issues.apache.org/jira/browse/HBASE-18138
 Project: HBase
  Issue Type: New Feature
  Components: BlockCache, BucketCache
Reporter: Biju Nair


Instead of a single read (block) cache, if HBase could support creating named 
read caches and letting tables use them, it would help common scenarios such as:

- Assigning a chunk of the cache to tables whose data is critical to 
performance, so that it doesn't get evicted when less critical table data is 
read
- Guaranteeing a percentage of the cache to each tenant in a multi-tenant 
environment by assigning a named cache per tenant



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030593#comment-16030593
 ] 

Phil Yang commented on HBASE-18122:
---

Comment is also fixed

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-18122:
--
Attachment: HBASE-18122.v04.patch

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-18122:
--
Attachment: (was: HBASE-18122.v04.patch)

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18122) Scanner id should include ServerName of region server

2017-05-30 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-18122:
--
Attachment: HBASE-18122.v04.patch

Change field to final and retry UT

> Scanner id should include ServerName of region server
> -
>
> Key: HBASE-18122
> URL: https://issues.apache.org/jira/browse/HBASE-18122
> Project: HBase
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-18122.v01.patch, HBASE-18122.v02.patch, 
> HBASE-18122.v03.patch, HBASE-18122.v04.patch
>
>
> Currently the scanner id is a long number counting up from 1 within a region 
> server; each new scanner gets the next id.
> If a client holds a scanner whose id is x and the RS restarts, the new RS may 
> count up to x or a little beyond, producing a scanner id collision.
> So the scanner id must not repeat across RS restarts. We can encode the RS 
> start timestamp in the highest bits of the uint64 scanner id.
> And because HBASE-18121 is not easy to fix and there are many clients running 
> old versions, we can also encode the server host:port into the scanner id.
> So we can use ServerName.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030583#comment-16030583
 ] 

Hudson commented on HBASE-14614:


FAILURE: Integrated in Jenkins build HBase-HBASE-14614 #255 (See 
[https://builds.apache.org/job/HBase-HBASE-14614/255/])
HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi) (stack: 
rev b2925bc0fb613ec1d076183891d0dfc5b995e429)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestTableFavoredNodes.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentListener.java
* (delete) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DispatchMergingRegionsProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcServer.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestServerBusyException.java
* (edit) 
hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/AccessControlProtos.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/namespace/TestNamespaceAuditor.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/MockMasterServices.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestFavoredStochasticLoadBalancer.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckReplicas.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* (edit) 
hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureSchedulerPerformanceEvaluation.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestAssignmentManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureConstants.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineRegionProcedure.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/Util.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteColumnFamilyProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMetaShutdownHandler.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java
* (delete) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestCreateTableProcedure.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureEvent.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureTestingUtility.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/NoopProcedureStore.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureScheduler.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController3.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestBlockEvictionFromClient.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
* (edit) 

[jira] [Updated] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Allan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-18132:
---
Attachment: HBASE-18132-branch-1.v4.patch

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, 
> HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch, 
> HBASE-18132-branch-1.v4.patch
>
>
> Currently we only check for low replication of WALs when there is a sync 
> operation (HBASE-2234), rolling the log if the WAL's replica count is lower 
> than configured. But if a WAL has very few writes, or none at all, low 
> replication is never detected and the log is never rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a 
> WAL with no writes get restarted, the WAL file ends up in an abnormal state, 
> and later attempts to open the file always fail.
> This patch checks WALs for low replication at a configured period. During a 
> rolling upgrade of datanodes, as long as the restart interval between two 
> nodes is longer than the low-replication check period, the WAL will be closed 
> and rolled normally. A UT in the patch demonstrates this.
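
A minimal sketch of the periodic check the patch describes, with a placeholder
for how the current replica count is obtained (in the real code this would
come from the WAL's output stream); the scheduling and names here are
illustrative only:

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Illustrative only: roll the WAL if its replica count stays below the expected value.
public class LowReplicationCheckerSketch {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public void start(IntSupplier currentReplicaCount, int expectedReplicas,
      Runnable requestLogRoll, long checkPeriodMillis) {
    scheduler.scheduleAtFixedRate(() -> {
      // Runs even when the WAL takes no writes, which is exactly the case
      // the sync-time check misses.
      if (currentReplicaCount.getAsInt() < expectedReplicas) {
        requestLogRoll.run();
      }
    }, checkPeriodMillis, checkPeriodMillis, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}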



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030553#comment-16030553
 ] 

Ted Yu commented on HBASE-16392:


I searched for both messages (of the println) above in test output - no hit 
either.


> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modifies both the file system and the backup system table. We 
> have to make sure the operation is atomic, durable and isolated.
> Delete operation:
> # Start a backup session; this guarantees that the system is blocked for all 
> other backup commands during the delete operation
> # Save the list of tables being deleted to the system table
> # Before the delete operation, take a snapshot of the backup system table
> # If any failure is detected during the delete operation, restore the backup 
> system table from the snapshot, then finish the backup session
> # To guarantee consistency of the data, the delete operation MUST be repeated
> # All file delete operations are guaranteed to be idempotent and can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, the repair command must be executed
> # The repair command checks whether there is a failed delete op recorded in 
> the backup system table, and repeats the delete operation
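
A hedged sketch of the snapshot-guarded delete flow listed above; every helper
here (startExclusiveSession, snapshotSystemTable, markDeleteInProgress, and so
on) is a hypothetical placeholder for the corresponding numbered step, not the
real backup API:

{code}
import java.io.IOException;
import java.util.List;

// Illustrative only: the atomicity pattern behind the fault-tolerant delete.
interface BackupSystemOps {
  void startExclusiveSession() throws IOException;      // step 1: block other backup commands
  void markDeleteInProgress(List<String> backupIds) throws IOException;   // step 2
  void snapshotSystemTable() throws IOException;        // step 3
  void deleteBackupFiles(List<String> backupIds) throws IOException;      // idempotent (step 6)
  void restoreSystemTable() throws IOException;         // step 4: roll back table state
  void clearDeleteInProgress() throws IOException;
  void finishSession() throws IOException;
}

class FaultTolerantDeleteSketch {
  static void delete(BackupSystemOps ops, List<String> backupIds) throws IOException {
    ops.startExclusiveSession();
    try {
      ops.markDeleteInProgress(backupIds);
      ops.snapshotSystemTable();
      try {
        ops.deleteBackupFiles(backupIds);   // safe to repeat; repair re-runs this step
        ops.clearDeleteInProgress();
      } catch (IOException e) {
        // Leave the in-progress marker so the repair command can repeat the delete,
        // and restore the system table so its state stays consistent.
        ops.restoreSystemTable();
        throw e;
      }
    } finally {
      ops.finishSession();
    }
  }
}
{code}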



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030549#comment-16030549
 ] 

Vladimir Rodionov commented on HBASE-16392:
---

{quote}
I ran TestRepairAfterFailedDelete and searched for "backup repair" / "repair 
tool" in the test output - no hit.
Mind telling me which message should be covered by the test ?
{quote}

The test emulates a failure by modifying the backup system table after a 
backup. Then it runs the repair command and checks the system table again. 
That is why you do not see "backup repair" / "repair tool" in the test's output.

> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modifies both the file system and the backup system table. We 
> have to make sure the operation is atomic, durable and isolated.
> Delete operation:
> # Start a backup session; this guarantees that the system is blocked for all 
> other backup commands during the delete operation
> # Save the list of tables being deleted to the system table
> # Before the delete operation, take a snapshot of the backup system table
> # If any failure is detected during the delete operation, restore the backup 
> system table from the snapshot, then finish the backup session
> # To guarantee consistency of the data, the delete operation MUST be repeated
> # All file delete operations are guaranteed to be idempotent and can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, the repair command must be executed
> # The repair command checks whether there is a failed delete op recorded in 
> the backup system table, and repeats the delete operation



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030548#comment-16030548
 ] 

Vladimir Rodionov commented on HBASE-16392:
---

Message is in Delete command (BackupCommands class):
{code}
@Override
public void execute() throws IOException {
  if (cmdline == null || cmdline.getArgs() == null || cmdline.getArgs().length < 2) {
    printUsage();
    throw new IOException(INCORRECT_USAGE);
  }

  super.execute();

  String[] args = cmdline.getArgs();
  String[] backupIds = new String[args.length - 1];
  System.arraycopy(args, 1, backupIds, 0, backupIds.length);
  try (BackupAdminImpl admin = new BackupAdminImpl(conn);) {
    int deleted = admin.deleteBackups(backupIds);
    System.out.println("Deleted " + deleted + " backups. Total requested: " + args.length);
  } catch (IOException e) {
    System.err.println("Delete command FAILED. Please run backup repair tool"
        + " to restore backup system integrity");
    throw e;
  }
}
{code}

> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modifies both the file system and the backup system table. We 
> have to make sure the operation is atomic, durable and isolated.
> Delete operation:
> # Start a backup session; this guarantees that the system is blocked for all 
> other backup commands during the delete operation
> # Save the list of tables being deleted to the system table
> # Before the delete operation, take a snapshot of the backup system table
> # If any failure is detected during the delete operation, restore the backup 
> system table from the snapshot, then finish the backup session
> # To guarantee consistency of the data, the delete operation MUST be repeated
> # All file delete operations are guaranteed to be idempotent and can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, the repair command must be executed
> # The repair command checks whether there is a failed delete op recorded in 
> the backup system table, and repeats the delete operation



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030545#comment-16030545
 ] 

Ted Yu commented on HBASE-18132:


{code}
Errors were encountered while processing:
 oracle-java7-installer
{code}
Installing Java 7 failed.

Attach a master patch; its QA run wouldn't have this problem.

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, 
> HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch
>
>
> Currently we only check for low replication of WALs when there is a sync 
> operation (HBASE-2234), rolling the log if the WAL's replica count is lower 
> than configured. But if a WAL has very few writes, or none at all, low 
> replication is never detected and the log is never rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a 
> WAL with no writes get restarted, the WAL file ends up in an abnormal state, 
> and later attempts to open the file always fail.
> This patch checks WALs for low replication at a configured period. During a 
> rolling upgrade of datanodes, as long as the restart interval between two 
> nodes is longer than the low-replication check period, the WAL will be closed 
> and rolled normally. A UT in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18131) Add an hbase shell command to clear deadserver list in ServerManager

2017-05-30 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030535#comment-16030535
 ] 

Yu Li commented on HBASE-18131:
---

bq. I think the root cause of this is not that servers are in the dead-servers 
list indefinitely
Possibly, but the server will be left in the dead-servers list (until the 
master restarts) if it's stopped on purpose (e.g. for hardware repair), right? 
So I think the new command will still be a good tool in such a scenario? Thanks.

> Add an hbase shell command to clear deadserver list in ServerManager
> 
>
> Key: HBASE-18131
> URL: https://issues.apache.org/jira/browse/HBASE-18131
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.4.0
>
>
> Currently, if a regionserver is aborted due to a fatal error or stopped by an 
> operator on purpose, it is added to the {{ServerManager#deadservers}} list 
> and shown under "Dead Servers" in the master UI. This is a valid warning for 
> operators to notice self-aborted servers and run a sanity check to avoid 
> further issues. However, after the necessary checks, even if the operator is 
> sure that the node is decommissioned (such as for repair), there's no way to 
> clear the dead server list except restarting the master. See more details in 
> [this 
> discussion|http://mail-archives.apache.org/mod_mbox/hbase-user/201705.mbox/%3CCAM7-19%2BD4MLu2b1R94%2BtWQDspjfny2sCy4Qit8JtCgjvTOZzzg%40mail.gmail.com%3E]
>  on the mailing list.
> Here we propose adding an hbase shell command that allows advanced users to 
> clear the dead server list in {{ServerManager}}; the command should be 
> executed with caution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Allan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-18132:
---
Attachment: HBASE-18132-branch-1.v3.patch

[~tedyu], why does Hadoop QA keep failing? Could you take a look? Thanks!

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, 
> HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch
>
>
> For now, we only check a WAL for low replication when there is a sync operation 
> (HBASE-2234), rolling the log if the WAL's replica count is lower than 
> configured. But if the WAL has very few writes, or no writes at all, low 
> replication will not be detected and no log will be rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a WAL 
> with no writes will be restarted, leaving the WAL file in an abnormal state, and 
> later attempts to open that file will always fail.
> I propose a patch that checks WALs for low replication at a configured interval. 
> During a rolling upgrade of datanodes, as long as the restart interval between 
> two nodes is larger than the low-replication check interval, the WAL will be 
> closed and rolled normally. A unit test in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Guangxu Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030531#comment-16030531
 ] 

Guangxu Cheng commented on HBASE-18129:
---

Updated the patch per [~yuzhih...@gmail.com]'s suggestions. If there are no 
problems, the patch for the master branch will come soon. Thanks.

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.
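For reference, the Java-level call the shell is ultimately trying to make is the Admin overload that recreates the table with the saved region boundaries. That overload does exist in the Admin API; the sketch below only restates it and is not part of the patch:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Admin;

// Sketch: recreate the table with the region boundaries captured before the drop.
public class RecreateWithBoundariesSketch {
  static void recreate(Admin admin, HTableDescriptor desc, byte[][] splitKeys)
      throws IOException {
    // The Java overload that the JRuby method-resolution error above is about.
    admin.createTable(desc, splitKeys);
  }
}
{code}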



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-18129:
---
Release Note: The command truncate_preserve will be fine when the truncate 
method doesn't exist on the master  (was: the command truncate_preserve will be 
fine when the truncate method doesn't exists on the master)

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-18129:
--
Attachment: HBASE-18129-branch-1-v2.patch

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-branch-1-v2.patch, 
> HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-30 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030526#comment-16030526
 ] 

Yu Li commented on HBASE-15160:
---

bq. why do you think that updating the metrics should be pulled up the stack?
Since there is synchronization in {{getMetaBlock}}, moving up the stack would let 
us record the IO time of {{getMetaBlock}} outside the lock. But considering that 
in the real world the meta block is cached in most cases, it's OK to exclude its 
IO time. So this is not an argument, just a clarification of my thinking since you 
asked (smile).

Reasons such as making the backport easier stand, so the current patch LGTM. Let 
me run YCSB to make sure there is no performance regression with it (there 
shouldn't be, since the current patch is quite similar to the one we're running 
online).
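As a minimal sketch of the kind of sampling being discussed (not the actual patch; a plain RandomAccessFile stands in for the HDFS stream, and the metric fields are placeholders for whatever histogram the patch wires up), the read itself is timed and the latency is recorded outside any block-level lock:

{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: time a positional read and record the elapsed latency.
public class PreadLatencySampler {
  private final AtomicLong totalPreadMillis = new AtomicLong();
  private final AtomicLong preadCount = new AtomicLong();

  public int timedPread(RandomAccessFile file, long pos, byte[] buf) throws IOException {
    long startNs = System.nanoTime();
    try {
      file.seek(pos);
      return file.read(buf);               // stands in for the HDFS pread being measured
    } finally {
      long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
      totalPreadMillis.addAndGet(ms);       // a real patch would feed a metrics histogram
      preadCount.incrementAndGet();
    }
  }
}
{code}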

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all of the HDFS op latency sampling code, including 
> fsReadLatency, fsPreadLatency and fsWriteLatency, was removed. There was some 
> discussion about putting it back in a new JIRA, but that never happened. 
> In our experience these metrics are useful for judging whether an issue lies in 
> HDFS when slow requests occur, so we propose to put them back in this JIRA and 
> add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030516#comment-16030516
 ] 

Ted Yu commented on HBASE-18132:


{code}
108 "hbase.regionserver.hlog.checklowreplicationinterval", 30 * 
1000);
{code}
Add dots (word separator) in checklowreplicationinterval so that it is easier 
to read.
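For illustration, a rough sketch of the periodic check being discussed. The dotted key follows the suggestion above, and the Timer-based wiring and default are assumptions, not the actual patch:

{code}
import java.util.Timer;
import java.util.TimerTask;
import org.apache.hadoop.conf.Configuration;

// Sketch: schedule a recurring low-replication check for the current WAL.
public class LowReplicationChecker {
  public static void schedule(Configuration conf, Runnable checkLowReplication) {
    long intervalMs = conf.getLong(
        "hbase.regionserver.hlog.check.low.replication.interval", 30 * 1000L);
    Timer timer = new Timer("WALLowReplicationChecker", true);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        // In the patch this would inspect the current WAL's replica count and
        // request a log roll if it is below the configured minimum.
        checkLowReplication.run();
      }
    }, intervalMs, intervalMs);
  }
}
{code}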

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, HBASE-18132-branch-1.v2.patch
>
>
> For now, we only check a WAL for low replication when there is a sync operation 
> (HBASE-2234), rolling the log if the WAL's replica count is lower than 
> configured. But if the WAL has very few writes, or no writes at all, low 
> replication will not be detected and no log will be rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a WAL 
> with no writes will be restarted, leaving the WAL file in an abnormal state, and 
> later attempts to open that file will always fail.
> I propose a patch that checks WALs for low replication at a configured interval. 
> During a rolling upgrade of datanodes, as long as the restart interval between 
> two nodes is larger than the low-replication check interval, the WAL will be 
> closed and rolled normally. A unit test in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18082) Provide capability for backup / restore client to pace execution

2017-05-30 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-18082:
---
Labels: backup  (was: )

> Provide capability for backup / restore client to pace execution
> 
>
> Key: HBASE-18082
> URL: https://issues.apache.org/jira/browse/HBASE-18082
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>  Labels: backup
>
> Currently the backup / restore is kind of monolithic in that once backup / 
> restore fails, the retry would start from the beginning of the operation.
> Backup / restore client can record the last successful substep of the current 
> operation (in backup table, e.g.) so that subsequent operation can resume 
> from this substep.
> This would allow shorter execution time for the retry operation and give 
> better user experience.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030504#comment-16030504
 ] 

Hadoop QA commented on HBASE-18132:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 38s 
{color} | {color:red} Docker failed to build yetus/hbase:58c504e. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870490/HBASE-18132-branch-1.v2.patch
 |
| JIRA Issue | HBASE-18132 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7009/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, HBASE-18132-branch-1.v2.patch
>
>
> For now, we only check a WAL for low replication when there is a sync operation 
> (HBASE-2234), rolling the log if the WAL's replica count is lower than 
> configured. But if the WAL has very few writes, or no writes at all, low 
> replication will not be detected and no log will be rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a WAL 
> with no writes will be restarted, leaving the WAL file in an abnormal state, and 
> later attempts to open that file will always fail.
> I propose a patch that checks WALs for low replication at a configured interval. 
> During a rolling upgrade of datanodes, as long as the restart interval between 
> two nodes is larger than the low-replication check interval, the WAL will be 
> closed and rolled normally. A unit test in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18089) TestScannerHeartbeatMessages fails in branch-1

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030502#comment-16030502
 ] 

Ted Yu commented on HBASE-18089:


For the 2nd part of testImportanceOfHeartbeats:
{code}
HeartbeatRPCServices.heartbeatsEnabled = false;
try {
  testCallable.call();
} catch (Exception e) {
  return;
{code}
What might have happened when the test failed is that call() returned before an 
exception could be thrown.
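For reference, a sketch of the pattern in question (not the test as committed); if call() completes before the server-side time limit is reached, no exception is thrown and execution falls through to fail(), which matches the failure seen above:

{code}
// Fragment in the test's context; fail() is the usual org.junit.Assert helper.
HeartbeatRPCServices.heartbeatsEnabled = false;
try {
  testCallable.call();   // expected to exceed the time limit and throw
  fail("Heartbeats are disabled; the scan should have failed with a timeout");
} catch (Exception e) {
  // expected path: without heartbeat messages the scanner RPC times out
} finally {
  HeartbeatRPCServices.heartbeatsEnabled = true;  // restore for later test cases
}
{code}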

> TestScannerHeartbeatMessages fails in branch-1
> --
>
> Key: HBASE-18089
> URL: https://issues.apache.org/jira/browse/HBASE-18089
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: huaxiang sun
> Attachments: test-heartbeat-6860.out
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/6860/artifact/patchprocess/patch-unit-hbase-server.txt
>  :
> {code}
> testScannerHeartbeatMessages(org.apache.hadoop.hbase.regionserver.TestScannerHeartbeatMessages)
>   Time elapsed: 2.376 sec  <<< FAILURE!
> java.lang.AssertionError: Heartbeats messages are disabled, an exception 
> should be thrown. If an exception  is not thrown, the test case is not 
> testing the importance of heartbeat messages
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.hbase.regionserver.TestScannerHeartbeatMessages.testImportanceOfHeartbeats(TestScannerHeartbeatMessages.java:237)
>   at 
> org.apache.hadoop.hbase.regionserver.TestScannerHeartbeatMessages.testScannerHeartbeatMessages(TestScannerHeartbeatMessages.java:207)
> {code}
> Similar test failure can be observed in 
> https://builds.apache.org/job/PreCommit-HBASE-Build/6852/artifact/patchprocess/patch-unit-hbase-server.txt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Allan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-18132:
---
Attachment: HBASE-18132-branch-1.v2.patch

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, HBASE-18132-branch-1.v2.patch
>
>
> For now, we only check a WAL for low replication when there is a sync operation 
> (HBASE-2234), rolling the log if the WAL's replica count is lower than 
> configured. But if the WAL has very few writes, or no writes at all, low 
> replication will not be detected and no log will be rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a WAL 
> with no writes will be restarted, leaving the WAL file in an abnormal state, and 
> later attempts to open that file will always fail.
> I propose a patch that checks WALs for low replication at a configured interval. 
> During a rolling upgrade of datanodes, as long as the restart interval between 
> two nodes is larger than the low-replication check interval, the WAL will be 
> closed and rolled normally. A unit test in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18128) compaction marker could be skipped

2017-05-30 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030488#comment-16030488
 ] 

Jingyun Tian commented on HBASE-18128:
--

Sure. I'll come up with a fix and we can discuss if it is reasonable.

> compaction marker could be skipped 
> ---
>
> Key: HBASE-18128
> URL: https://issues.apache.org/jira/browse/HBASE-18128
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction, regionserver
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>
> The sequence for a compaction is as follows:
> 1. Compaction writes new files under region/.tmp directory (compaction output)
> 2. Compaction atomically moves the temporary file under region directory
> 3. Compaction appends a WAL edit containing the compaction input and output 
> files. Forces sync on WAL.
> 4. Compaction deletes the input files from the region directory.
> But if a flush happens between 3 and 4 and then the regionserver crashes, the 
> compaction marker will be skipped when splitting the log, because the sequence 
> id of the compaction marker is smaller than lastFlushedSequenceId.
> {code}
> if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
>   editsSkipped++;
>   continue;
> }
> {code}
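For illustration only (not the eventual fix), one way to exempt compaction markers from the check quoted above; isCompactionMarker() is a hypothetical helper, not an existing method:

{code}
// Sketch: still replay compaction markers during log splitting even when a
// flush advanced lastFlushedSequenceId between steps 3 and 4.
if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()
    && !isCompactionMarker(entry)) {   // isCompactionMarker() is assumed for illustration
  editsSkipped++;
  continue;
}
{code}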



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-18128) compaction marker could be skipped

2017-05-30 Thread Jingyun Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian reassigned HBASE-18128:


Assignee: Jingyun Tian

> compaction marker could be skipped 
> ---
>
> Key: HBASE-18128
> URL: https://issues.apache.org/jira/browse/HBASE-18128
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction, regionserver
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>
> The sequence for a compaction is as follows:
> 1. Compaction writes new files under region/.tmp directory (compaction output)
> 2. Compaction atomically moves the temporary file under region directory
> 3. Compaction appends a WAL edit containing the compaction input and output 
> files. Forces sync on WAL.
> 4. Compaction deletes the input files from the region directory.
> But if a flush happens between 3 and 4 and then the regionserver crashes, the 
> compaction marker will be skipped when splitting the log, because the sequence 
> id of the compaction marker is smaller than lastFlushedSequenceId.
> {code}
> if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
>   editsSkipped++;
>   continue;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15995) Separate replication WAL reading from shipping

2017-05-30 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030482#comment-16030482
 ] 

Guanghao Zhang commented on HBASE-15995:


[~vincentpoon] would you consider putting up a patch for branch-1 as well? 
[~tedyu] [~apurtell] This improvement doesn't seem to break any compatibility, so 
could it be merged into branch-1, too? Thanks.

> Separate replication WAL reading from shipping
> --
>
> Key: HBASE-15995
> URL: https://issues.apache.org/jira/browse/HBASE-15995
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Fix For: 2.0.0
>
> Attachments: HBASE-15995.master.v1.patch, 
> HBASE-15995.master.v2.patch, HBASE-15995.master.v3.patch, 
> HBASE-15995.master.v4.patch, HBASE-15995.master.v6.patch, 
> HBASE-15995.master.v7.patch, replicationV1_100ms_delay.png, 
> replicationV2_100ms_delay.png
>
>
> Currently ReplicationSource reads edits from the WAL and ships them in the 
> same thread.
> By breaking out the reading from the shipping, we can introduce greater 
> parallelism and lay the foundation for further refactoring to a pipelined, 
> streaming model.
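As a toy illustration of the split (plain Java, not the patch itself): a reader thread fills a bounded queue of batches and a shipper thread drains it, so slow shipping no longer stalls reading, with backpressure coming from the queue:

{code}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ReaderShipperSketch {
  public static void main(String[] args) throws InterruptedException {
    BlockingQueue<List<String>> batches = new ArrayBlockingQueue<>(16);
    final List<String> POISON = Arrays.asList();          // empty list signals end of stream

    Thread reader = new Thread(() -> {
      try {
        for (int i = 0; i < 3; i++) {
          batches.put(Arrays.asList("edit-" + i));        // stands in for a batch of WAL edits
        }
        batches.put(POISON);
      } catch (InterruptedException ignored) { }
    }, "wal-reader");

    Thread shipper = new Thread(() -> {
      try {
        while (true) {
          List<String> batch = batches.take();            // blocks until a batch is ready
          if (batch.isEmpty()) break;                      // poison pill -> stop
          System.out.println("shipping " + batch);         // stands in for the replication RPC
        }
      } catch (InterruptedException ignored) { }
    }, "wal-shipper");

    reader.start();
    shipper.start();
    reader.join();
    shipper.join();
  }
}
{code}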



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030481#comment-16030481
 ] 

Allan Yang edited comment on HBASE-18132 at 5/31/17 1:35 AM:
-

{quote}
How is the default value of 30 seconds determined ?
{quote}
It doesn't matter much; the only requirement is that the interval for checking low 
replication is smaller than the interval between datanode restarts. In our case, 
we set the DN restart interval during a rolling upgrade to 1 minute, so we set the 
check interval to 30 seconds.
Thanks for your advice, [~tedyu]. I will modify the patch and upload a master 
patch later.


was (Author: allan163):
{quote}
How is the default value of 30 seconds determined ?
{quote}
It doesn't matter,  the only requirement is that the interval of checking low 
replication is smaller than the interval of restarting datanodes. In our case, 
we set the restart interval of DN at rolling start to 1 min. So we set the 
check interval to 30 seconds.
Thanks for your advice, [~tedyu]. I will modify the patch and upload a master 
patch later

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch
>
>
> For now, we just check low replication of WALs when there is a sync operation 
> (HBASE-2234), rolling the log if the replica of the WAL is less than 
> configured. But if the WAL has very little writes or no writes at all, low 
> replication will not be detected and thus no log will be rolled. 
> That is a problem when rolling updating datanode, all replica of the WAL with 
> no writes will be restarted and lead to the WAL file end up with a abnormal 
> state. Later operation of opening this file will be always failed.
> I bring up a patch to check low replication of WALs at a configured period. 
> When rolling updating datanodes, we just make sure the restart interval time 
> between two nodes is bigger than the low replication check time, the WAL will 
> be closed and rolled normally. A UT in the patch will show everything.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade

2017-05-30 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030481#comment-16030481
 ] 

Allan Yang commented on HBASE-18132:


{quote}
How is the default value of 30 seconds determined ?
{quote}
It doesn't matter much; the only requirement is that the interval for checking low 
replication is smaller than the interval between datanode restarts. In our case, 
we set the restart interval of the DNs at rolling start to 1 minute, so we set the 
check interval to 30 seconds.
Thanks for your advice, [~tedyu]. I will modify the patch and upload a master 
patch later.

> Low replication should be checked in period in case of datanode rolling 
> upgrade
> ---
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.1.10
>Reporter: Allan Yang
>Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch
>
>
> For now, we only check a WAL for low replication when there is a sync operation 
> (HBASE-2234), rolling the log if the WAL's replica count is lower than 
> configured. But if the WAL has very few writes, or no writes at all, low 
> replication will not be detected and no log will be rolled.
> That is a problem during a rolling upgrade of datanodes: all replicas of a WAL 
> with no writes will be restarted, leaving the WAL file in an abnormal state, and 
> later attempts to open that file will always fail.
> I propose a patch that checks WALs for low replication at a configured interval. 
> During a rolling upgrade of datanodes, as long as the restart interval between 
> two nodes is larger than the low-replication check interval, the WAL will be 
> closed and rolled normally. A unit test in the patch demonstrates this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-15903) Delete Object

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030472#comment-16030472
 ] 

Ted Yu edited comment on HBASE-15903 at 5/31/17 1:34 AM:
-

Patch v7 adds delete-test and enhances PutGetDelete test to cover various 
delete class methods.


was (Author: yuzhih...@gmail.com):
Patch v7 adds delete-test and enhances PutGetDelete test to cover various 
delete classes.

> Delete Object
> -
>
> Key: HBASE-15903
> URL: https://issues.apache.org/jira/browse/HBASE-15903
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Ted Yu
> Attachments: 15903.v2.txt, 15903.v4.txt, 15903.v7.txt, 
> HBASE-15903.HBASE-14850.v1.patch
>
>
> Patch for creating Delete objects. These Delete objects are used by the Table 
> implementation to delete a row key from a table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-18137) Replication gets stuck for empty WALs

2017-05-30 Thread Ashu Pachauri (JIRA)
Ashu Pachauri created HBASE-18137:
-

 Summary: Replication gets stuck for empty WALs
 Key: HBASE-18137
 URL: https://issues.apache.org/jira/browse/HBASE-18137
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 1.3.1
Reporter: Ashu Pachauri


Replication assumes that only the last WAL of a recovered queue can be empty. 
But intermittent DFS issues may cause empty WALs to be created (without the 
PWAL magic) and a WAL roll to happen without a regionserver crash. This can 
leave empty WALs in the middle of recovered queues, which causes replication to 
get stuck:
{code}
TRACE regionserver.ReplicationSource: Opening log 
WARN regionserver.ReplicationSource: - Got: 
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1915)
at 
org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1880)
at 
org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1829)
at 
org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1843)
at 
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.(SequenceFileLogReader.java:70)
at 
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:168)
at 
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:177)
at 
org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:66)
at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:312)
at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:276)
at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:264)
at 
org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:423)
at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationWALReaderManager.openReader(ReplicationWALReaderManager.java:70)
at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.openReader(ReplicationSource.java:830)
at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.run(ReplicationSource.java:572)
{code}

The WAL in question was completely empty but there were other WALs in the 
recovered queue which were newer and non-empty.
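One possible mitigation, sketched here as an assumption rather than the committed fix: if a WAL in a recovered queue is zero length and newer WALs exist behind it, treat the EOFException as "nothing to replicate" and move on instead of retrying the same file forever:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: decide whether an empty WAL in the middle of a recovered queue
// can be skipped.
public class EmptyWalSkipSketch {
  static boolean shouldSkip(FileSystem fs, Path wal, boolean hasNewerWalsInQueue)
      throws IOException {
    FileStatus status = fs.getFileStatus(wal);
    // A zero-length file cannot contain the PWAL magic or any entries; if the
    // queue has newer WALs after it, there is nothing to ship from this one.
    return status.getLen() == 0 && hasNewerWalsInQueue;
  }
}
{code}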



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030474#comment-16030474
 ] 

Hadoop QA commented on HBASE-14614:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 102 new or modified 
test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 4s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 12m 29s 
{color} | {color:red} root in master failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s 
{color} | {color:red} hbase-hadoop-compat in master failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 
13s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 20s 
{color} | {color:red} hbase-server in master failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 19s 
{color} | {color:red} hbase-client in master failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 8m 2s 
{color} | {color:red} hbase-protocol-shaded in master has 24 extant Findbugs 
warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 33s 
{color} | {color:red} hbase-rsgroup in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 14m 3s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 3m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 
6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
32s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s 
{color} | {color:red} The patch has 1056 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 38s 
{color} | {color:red} The patch 600 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
33m 28s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 2m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 40s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s 
{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 37s 
{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s 
{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 51s 
{color} | {color:green} 

[jira] [Updated] (HBASE-15903) Delete Object

2017-05-30 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-15903:
---
Attachment: 15903.v7.txt

Patch v7 adds delete-test and enhances PutGetDelete test to cover various 
delete classes.

> Delete Object
> -
>
> Key: HBASE-15903
> URL: https://issues.apache.org/jira/browse/HBASE-15903
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Ted Yu
> Attachments: 15903.v2.txt, 15903.v4.txt, 15903.v7.txt, 
> HBASE-15903.HBASE-14850.v1.patch
>
>
> Patch for creating Delete objects. These Delete objects are used by the Table 
> implementation to delete a row key from a table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18027) Replication should respect RPC size limits when batching edits

2017-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030460#comment-16030460
 ] 

Hudson commented on HBASE-18027:


FAILURE: Integrated in Jenkins build HBase-1.4 #752 (See 
[https://builds.apache.org/job/HBase-1.4/752/])
HBASE-18027 HBaseInterClusterReplicationEndpoint should respect RPC (apurtell: 
rev 140c559a3a2528bc1485852bb9b5251901d58798)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicator.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java


> Replication should respect RPC size limits when batching edits
> --
>
> Key: HBASE-18027
> URL: https://issues.apache.org/jira/browse/HBASE-18027
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, 
> HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
>
>
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in 
> batches. We create N lists. N is the minimum of configured replicator 
> threads, number of 100-waledit batches, or number of current sinks. Every 
> pending entry in the replication context is then placed in order by hash of 
> encoded region name into one of these N lists. Each of the N lists is then 
> sent all at once in one replication RPC. We do not test whether the sum of the 
> data in each of the N lists will exceed RPC size limits. This code presumes each 
> individual 
> edit is reasonably small. Not checking for aggregate size while assembling 
> the lists into RPCs is an oversight and can lead to replication failure when 
> that assumption is violated.
> We can fix this by generating as many replication RPC calls as we need to 
> drain a list, keeping each RPC under limit, instead of assuming the whole 
> list will fit in one.
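To illustrate the proposed direction (a sketch under assumptions, not the committed patch): cut each of the N lists into sub-batches whose estimated serialized size stays under the RPC limit, and send one RPC per sub-batch:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

public class RpcSizeBatcher {
  // Splits 'entries' into consecutive batches whose total estimated size stays
  // under limitBytes; a single oversized entry still goes out alone in its own RPC.
  static <T> List<List<T>> batchBySize(List<T> entries, long limitBytes,
      ToLongFunction<T> sizeOf) {
    List<List<T>> batches = new ArrayList<>();
    List<T> current = new ArrayList<>();
    long currentBytes = 0;
    for (T entry : entries) {
      long entryBytes = sizeOf.applyAsLong(entry);
      if (!current.isEmpty() && currentBytes + entryBytes > limitBytes) {
        batches.add(current);           // close the current batch before it overflows
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(entry);
      currentBytes += entryBytes;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }
}
{code}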



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030452#comment-16030452
 ] 

Ted Yu commented on HBASE-18129:


{code}
559 class UnSupportMethodException < StandardError
{code}
UnSupportMethodException -> UnsupportedMethodException

The new exception sounds general. However, the message is about truncateTable.
Is it possible to pass the cause to the exception ?

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-master.patch
>
>
> Recently, I runs a rolling upgrade from HBase 0.98.x to HBase 1.2.5. During 
> the master hasn't been upgraded yet, I truncate a table by the command 
> truncate_preserve of 1.2.5, but failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking code and commit history, I found it's HBASE-12833 which causes 
> this bug.so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030447#comment-16030447
 ] 

Ted Yu commented on HBASE-16392:


I ran TestRepairAfterFailedDelete and searched for "backup repair" / "repair 
tool" in the test output - no hits.

Mind telling me which message should be covered by the test?

Please also check the grammatical errors in the log messages.

> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modifies the file system and the backup system table. We have to 
> make sure that the operation is atomic, durable and isolated.
> Delete operation:
> # Start a backup session (this guarantees that the system is blocked for all 
> backup commands during the delete operation)
> # Save the list of tables being deleted to the system table
> # Before the delete operation we take a snapshot of the backup system table
> # During the delete operation we detect any failures and restore the backup 
> system table from the snapshot, then finish the backup session
> # To guarantee consistency of the data, the delete operation MUST be repeated
> # We guarantee that all file delete operations are idempotent and can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, the repair command must be executed.
> # The repair command checks whether there is a failed delete op in the backup 
> system table, and repeats the delete operation
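A toy sketch of the snapshot/rollback pattern described above (plain Java, not the backup code): snapshot the bookkeeping before deleting, restore it on failure, and rely on idempotent file deletes so a repair can safely repeat the operation:

{code}
import java.util.ArrayList;
import java.util.List;

public class DeleteWithRollbackSketch {
  private final List<String> systemTableRows = new ArrayList<>();

  public void deleteBackups(List<String> backupIds) {
    List<String> snapshot = new ArrayList<>(systemTableRows);  // snapshot before the delete
    try {
      for (String id : backupIds) {
        deleteBackupFiles(id);          // idempotent: safe to repeat on retry/repair
        systemTableRows.remove(id);     // update bookkeeping
      }
    } catch (RuntimeException e) {
      systemTableRows.clear();          // restore bookkeeping from the snapshot on failure
      systemTableRows.addAll(snapshot);
      throw e;                          // a repair pass would repeat the whole delete
    }
  }

  private void deleteBackupFiles(String backupId) {
    // Stand-in for removing backup images from the filesystem.
  }
}
{code}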



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030446#comment-16030446
 ] 

Hadoop QA commented on HBASE-18129:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 1s {color} 
| {color:red} Docker failed to build yetus/hbase:757bf37. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870481/HBASE-18129-master.patch
 |
| JIRA Issue | HBASE-18129 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7008/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18129) truncate_preserve fails when the truncate method doesn't exists on the master

2017-05-30 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-18129:
--
Attachment: HBASE-18129-master.patch

> truncate_preserve fails when the truncate method doesn't exists on the master
> -
>
> Key: HBASE-18129
> URL: https://issues.apache.org/jira/browse/HBASE-18129
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.5
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: HBASE-18129-branch-1.patch, 
> HBASE-18129-branch-1-v1.patch.patch, HBASE-18129-master.patch
>
>
> Recently, I ran a rolling upgrade from HBase 0.98.x to HBase 1.2.5. While the 
> master had not been upgraded yet, I truncated a table with the 1.2.5 
> truncate_preserve command, but it failed.
> {code}
> hbase(main):001:0> truncate_preserve 'cf_logs'
> Truncating 'cf_logs' table (it may take a while):
>  - Disabling table...
>  - Truncating table...
>  - Dropping table...
>  - Creating table with region boundaries...
> ERROR: no method 'createTable' for arguments 
> (org.apache.hadoop.hbase.HTableDescriptor,org.jruby.java.proxies.ArrayJavaProxy)
>  on Java::OrgApacheHadoopHbaseClient::HBaseAdmin
> {code}
> After checking the code and commit history, I found that HBASE-12833 causes 
> this bug, so we should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030432#comment-16030432
 ] 

Hadoop QA commented on HBASE-18054:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
48s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
48s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
28m 15s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 33s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 21s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870470/HBASE-18054.patch |
| JIRA Issue | HBASE-18054 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 9ff8409757b3 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / d547feac |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7007/testReport/ |
| modules | C: hbase-client U: hbase-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7007/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 1.3.0
>Reporter: Sean Busbey

[jira] [Commented] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030426#comment-16030426
 ] 

Hadoop QA commented on HBASE-16148:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 6s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 28 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 59s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
48s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
34s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
2s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 57s 
{color} | {color:red} hbase-protocol-shaded in master has 24 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 540 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 10s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red} 0m 10s 
{color} | {color:red} hbase-protocol-shaded in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 8s 
{color} | {color:red} hbase-protocol-shaded in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 48s 
{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s 
{color} | {color:green} hbase-protocol in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 8s {color} | 
{color:red} hbase-protocol-shaded in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 18s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 55s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
48s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 166m 31s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Dead store to pt in 

[jira] [Commented] (HBASE-18095) Provide an option for clients to find the server hosting META that does not involve the ZooKeeper client

2017-05-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030419#comment-16030419
 ] 

Andrew Purtell commented on HBASE-18095:


bq. Would this still work if SPNEGO is enabled?  If not, I'm not sure what else 
we can do short of a totally new endpoint.
Will have to check. If not then agreed we will need a totally new endpoint. 

> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> 
>
> Key: HBASE-18095
> URL: https://issues.apache.org/jira/browse/HBASE-18095
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Andrew Purtell
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers and the client uses an embedded ZK client to 
> query meta location. Timeouts and retry behavior of this embedded ZK client 
> are managed orthogonally to HBase layer settings and in some cases the ZK 
> cannot manage what in theory the HBase client can, i.e. fail fast upon outage 
> or network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in either 
> active or passive state will track meta location and respond to requests for 
> it with its cached last known location. If this location is stale, the client 
> can ask again with a flag set that requests the master refresh its location 
> cache and return the up-to-date location. Every client interaction with the 
> cluster thus uses only HBase RPC as transport, with appropriate settings 
> applied to the connection. The configuration toggle that enables this 
> alternative meta location lookup should be false by default.
> This removes the requirement that HBase clients embed the ZK client and 
> contact the ZK service directly at the beginning of the connection lifecycle. 
> This has several benefits. The ZK service need not be exposed to clients and 
> their potential abuse, yet no benefit that ZK provides to the HBase server 
> cluster is compromised. Normalizing HBase client and ZK client timeout 
> settings and retry behavior - in some cases impossible, e.g. for fail-fast - 
> is no longer necessary. 
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from Zookeeper to 
> select the token identifier to use. So there would also need to be some 
> Zookeeper-less, unauthenticated way to obtain the cluster ID as well. 
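
For illustration only, a minimal sketch of how a client might opt in to the 
proposed master-based lookup. The property names below are hypothetical 
placeholders, not settings that exist today:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MasterBasedMetaLookupSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Hypothetical keys; the real names would be defined by the patch for this issue.
    conf.set("hbase.client.master.addresses",
        "master1.example.com:16000,master2.example.com:16000");
    // Off by default, matching the proposal above.
    conf.setBoolean("hbase.client.master.meta.lookup.enabled", true);
    // With such settings the client would ask any listed master for its cached
    // meta location over HBase RPC, instead of reading it from ZooKeeper.
    System.out.println(conf.get("hbase.client.master.addresses"));
  }
}
{code}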



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18027) Replication should respect RPC size limits when batching edits

2017-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030416#comment-16030416
 ] 

Hudson commented on HBASE-18027:


FAILURE: Integrated in Jenkins build HBase-HBASE-14614 #254 (See 
[https://builds.apache.org/job/HBase-HBASE-14614/254/])
HBASE-18027 HBaseInterClusterReplicationEndpoint should respect RPC (apurtell: 
rev d547feac6b673d59703e1a0ef46db38b26046e4c)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicator.java


> Replication should respect RPC size limits when batching edits
> --
>
> Key: HBASE-18027
> URL: https://issues.apache.org/jira/browse/HBASE-18027
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, 
> HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
>
>
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in 
> batches. We create N lists. N is the minimum of configured replicator 
> threads, number of 100-waledit batches, or number of current sinks. Every 
> pending entry in the replication context is then placed in order by hash of 
> encoded region name into one of these N lists. Each of the N lists is then 
> sent all at once in one replication RPC. We do not test if the sum of data in 
> each N list will exceed RPC size limits. This code presumes each individual 
> edit is reasonably small. Not checking for aggregate size while assembling 
> the lists into RPCs is an oversight and can lead to replication failure when 
> that assumption is violated.
> We can fix this by generating as many replication RPC calls as we need to 
> drain a list, keeping each RPC under limit, instead of assuming the whole 
> list will fit in one.
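
For illustration, a self-contained sketch of the batching idea (not the actual 
HBaseInterClusterReplicationEndpoint code): drain each list into as many 
sub-batches as needed, keeping every batch under the size limit:

{code}
import java.util.ArrayList;
import java.util.List;

public class BatchBySizeSketch {
  /** Splits entries into batches whose summed sizes stay at or under maxBatchBytes. */
  static List<List<byte[]>> batchBySize(List<byte[]> entries, long maxBatchBytes) {
    List<List<byte[]>> batches = new ArrayList<>();
    List<byte[]> current = new ArrayList<>();
    long currentSize = 0;
    for (byte[] entry : entries) {
      // Start a new batch if adding this entry would exceed the limit
      // (an oversized single entry still gets a batch of its own).
      if (!current.isEmpty() && currentSize + entry.length > maxBatchBytes) {
        batches.add(current);
        current = new ArrayList<>();
        currentSize = 0;
      }
      current.add(entry);
      currentSize += entry.length;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    List<byte[]> edits = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      edits.add(new byte[40]); // pretend each WAL edit serializes to 40 bytes
    }
    // With a 100-byte "RPC limit" this yields batches of at most two edits, so 5 RPCs.
    System.out.println(batchBySize(edits, 100).size() + " RPCs");
  }
}
{code}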



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18095) Provide an option for clients to find the server hosting META that does not involve the ZooKeeper client

2017-05-30 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030415#comment-16030415
 ] 

Gary Helmling commented on HBASE-18095:
---

bq. Cluster ID lookup is most easily accomplished with a new servlet on the 
HTTP(S) endpoint on the masters, serving the cluster ID as plain text

Seems like the best option.  Would this still work if SPNEGO is enabled?  If 
not, I'm not sure what else we can do short of a totally new endpoint.

> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> 
>
> Key: HBASE-18095
> URL: https://issues.apache.org/jira/browse/HBASE-18095
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Andrew Purtell
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers and the client uses an embedded ZK client to 
> query meta location. Timeouts and retry behavior of this embedded ZK client 
> are managed orthogonally to HBase layer settings, and in some cases the ZK 
> client cannot do what in theory the HBase client can, i.e. fail fast upon 
> outage or network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in either 
> active or passive state will track meta location and respond to requests for 
> it with its cached last known location. If this location is stale, the client 
> can ask again with a flag set that requests the master refresh its location 
> cache and return the up-to-date location. Every client interaction with the 
> cluster thus uses only HBase RPC as transport, with appropriate settings 
> applied to the connection. The configuration toggle that enables this 
> alternative meta location lookup should be false by default.
> This removes the requirement that HBase clients embed the ZK client and 
> contact the ZK service directly at the beginning of the connection lifecycle. 
> This has several benefits. The ZK service need not be exposed to clients and 
> their potential abuse, yet no benefit that ZK provides to the HBase server 
> cluster is compromised. Normalizing HBase client and ZK client timeout 
> settings and retry behavior - in some cases impossible, e.g. for fail-fast - 
> is no longer necessary. 
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from Zookeeper to 
> select the token identifier to use. So there would also need to be some 
> Zookeeper-less, unauthenticated way to obtain the cluster ID as well. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030417#comment-16030417
 ] 

Hudson commented on HBASE-14614:


FAILURE: Integrated in Jenkins build HBase-HBASE-14614 #254 (See 
[https://builds.apache.org/job/HBase-HBASE-14614/254/])
HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi) (stack: 
rev 8ffd083efbfdbce0ce91471c229462bf6185b903)
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/MockMasterServices.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestFavoredStochasticBalancerPickers.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestModifyTableProcedure.java
* (edit) hbase-protocol-shaded/src/main/protobuf/RegionServerStatus.proto
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRestoreSnapshotProcedure.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaTableLocator.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/SimpleMasterProcedureManager.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestTableFavoredNodes.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineRegionProcedure.java
* (edit) 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroups.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterStatusServlet.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcDuplexHandler.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionPlan.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestCreateNamespaceProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncRegionAdminApi.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureInMemoryChore.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckReplicas.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestEnableTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/NoopProcedureStore.java
* (edit) 
hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/RegionServerStatusProtos.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSplitThread.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/RegionStateListener.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RestoreSnapshotProcedure.java
* (add) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableNamespaceManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestCreateTableProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterProcedureEvents.java
* (edit) 
hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsOfflineMode.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestEnableTable.java
* (edit) 

[jira] [Comment Edited] (HBASE-18095) Provide an option for clients to find the server hosting META that does not involve the ZooKeeper client

2017-05-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030398#comment-16030398
 ] 

Andrew Purtell edited comment on HBASE-18095 at 5/31/17 12:05 AM:
--

bq. There is an additional complication here for token-based authentication. 
When a delegation token is used for SASL authentication, the client uses the 
cluster ID obtained from Zookeeper to select the token identifier to use. So 
there would also need to be some Zookeeper-less, unauthenticated way to obtain 
the cluster ID as well.

[~ghelmling] Cluster ID lookup is most easily accomplished with a new servlet 
on the HTTP(S) endpoint on the masters, serving the cluster ID as plain text. 
It can't share the RPC server endpoint when SASL is enabled because any 
interaction with that endpoint must be authenticated. This is ugly but 
alternatives seem worse. One alternative would be a second RPC port for APIs 
that do not / cannot require prior authentication. 
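
For illustration, a minimal client-side sketch of fetching the cluster ID as 
plain text before any SASL handshake. The servlet path below is an assumption, 
not part of any committed patch:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ClusterIdFetchSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical path on the master info server (port 16010 by default).
    URL url = new URL("http://master1.example.com:16010/clusterid");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setConnectTimeout(5000);
    conn.setReadTimeout(5000);
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
      String line = reader.readLine();
      // The cluster ID would then be used to select the delegation token identifier.
      System.out.println("cluster id = " + (line == null ? "" : line.trim()));
    } finally {
      conn.disconnect();
    }
  }
}
{code}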


was (Author: apurtell):
bq. There is an additional complication here for token-based authentication. 
When a delegation token is used for SASL authentication, the client uses the 
cluster ID obtained from Zookeeper to select the token identifier to use. So 
there would also need to be some Zookeeper-less, unauthenticated way to obtain 
the cluster ID as well.

[~ghelmling] Cluster ID lookup is most easily accomplished with a new servlet 
on the HTTP(S) endpoint, serving the cluster ID as plain text. It can't share 
the RPC server endpoint when SASL is enabled because any interaction with that 
endpoint must be authenticated. This is ugly but alternatives seem worse. One 
alternative would be a second RPC port for APIs that do not / cannot require 
prior authentication. 

> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> 
>
> Key: HBASE-18095
> URL: https://issues.apache.org/jira/browse/HBASE-18095
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Andrew Purtell
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers and the client uses an embedded ZK client to 
> query meta location. Timeouts and retry behavior of this embedded ZK client 
> are managed orthogonally to HBase layer settings, and in some cases the ZK 
> client cannot do what in theory the HBase client can, i.e. fail fast upon 
> outage or network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in either 
> active or passive state will track meta location and respond to requests for 
> it with its cached last known location. If this location is stale, the client 
> can ask again with a flag set that requests the master refresh its location 
> cache and return the up-to-date location. Every client interaction with the 
> cluster thus uses only HBase RPC as transport, with appropriate settings 
> applied to the connection. The configuration toggle that enables this 
> alternative meta location lookup should be false by default.
> This removes the requirement that HBase clients embed the ZK client and 
> contact the ZK service directly at the beginning of the connection lifecycle. 
> This has several benefits. The ZK service need not be exposed to clients and 
> their potential abuse, yet no benefit that ZK provides to the HBase server 
> cluster is compromised. Normalizing HBase client and ZK client timeout 
> settings and retry behavior - in some cases impossible, e.g. for fail-fast - 
> is no longer necessary. 
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from Zookeeper to 
> select the token identifier to use. So there would also need to be some 
> Zookeeper-less, unauthenticated way to obtain the cluster ID as well. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18095) Provide an option for clients to find the server hosting META that does not involve the ZooKeeper client

2017-05-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030398#comment-16030398
 ] 

Andrew Purtell commented on HBASE-18095:


bq. There is an additional complication here for token-based authentication. 
When a delegation token is used for SASL authentication, the client uses the 
cluster ID obtained from Zookeeper to select the token identifier to use. So 
there would also need to be some Zookeeper-less, unauthenticated way to obtain 
the cluster ID as well.

[~ghelmling] Cluster ID lookup is most easily accomplished with a new servlet 
on the HTTP(S) endpoint, serving the cluster ID as plain text. It can't share 
the RPC server endpoint when SASL is enabled because any interaction with that 
endpoint must be authenticated. This is ugly but alternatives seem worse. One 
alternative would be a second RPC port for APIs that do not / cannot require 
prior authentication. 

> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> 
>
> Key: HBASE-18095
> URL: https://issues.apache.org/jira/browse/HBASE-18095
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Andrew Purtell
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers and the client uses an embedded ZK client to 
> query meta location. Timeouts and retry behavior of this embedded ZK client 
> are managed orthogonally to HBase layer settings, and in some cases the ZK 
> client cannot do what in theory the HBase client can, i.e. fail fast upon 
> outage or network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in either 
> active or passive state will track meta location and respond to requests for 
> it with its cached last known location. If this location is stale, the client 
> can ask again with a flag set that requests the master refresh its location 
> cache and return the up-to-date location. Every client interaction with the 
> cluster thus uses only HBase RPC as transport, with appropriate settings 
> applied to the connection. The configuration toggle that enables this 
> alternative meta location lookup should be false by default.
> This removes the requirement that HBase clients embed the ZK client and 
> contact the ZK service directly at the beginning of the connection lifecycle. 
> This has several benefits. The ZK service need not be exposed to clients and 
> their potential abuse, yet no benefit that ZK provides to the HBase server 
> cluster is compromised. Normalizing HBase client and ZK client timeout 
> settings and retry behavior - in some cases impossible, e.g. for fail-fast - 
> is no longer necessary. 
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from Zookeeper to 
> select the token identifier to use. So there would also need to be some 
> Zookeeper-less, unauthenticated way to obtain the cluster ID as well. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ali updated HBASE-18054:

Comment: was deleted

(was: yes it works now. thank you)

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 1.3.0
>Reporter: Sean Busbey
>Assignee: Ali
> Attachments: HBASE-18054.patch
>
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030375#comment-16030375
 ] 

Ali commented on HBASE-18054:
-

yes it works now. thank you

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 1.3.0
>Reporter: Sean Busbey
>Assignee: Ali
> Attachments: HBASE-18054.patch
>
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ali updated HBASE-18054:

Affects Version/s: (was: 1.4.0)
   (was: 1.2.0)
   (was: 1.1.0)
   (was: 2.0.0)
   Status: Patch Available  (was: Open)

Added log when a failed server is added to the list of failed servers.
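
For illustration only, a simplified stand-in for the client's failed-server 
tracking, showing the kind of log lines this issue asks for; it is a sketch, 
not the attached patch:

{code}
import java.util.HashMap;
import java.util.Map;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FailedServerListSketch {
  private static final Logger LOG = LoggerFactory.getLogger(FailedServerListSketch.class);

  private final long expiryMillis; // e.g. the FAILED_SERVER_EXPIRY_KEY setting, default 2000
  private final Map<String, Long> failedUntil = new HashMap<>();

  public FailedServerListSketch(long expiryMillis) {
    this.expiryMillis = expiryMillis;
  }

  public synchronized void addToFailedServers(String hostAndPort, Throwable cause) {
    long until = System.currentTimeMillis() + expiryMillis;
    failedUntil.put(hostAndPort, until);
    // The log line this issue asks for: record when (and why) a server was added.
    LOG.info("Added {} to the failed servers list until {}, cause: {}",
        hostAndPort, until, cause == null ? "unknown" : cause.toString());
  }

  public synchronized boolean isFailedServer(String hostAndPort) {
    Long until = failedUntil.get(hostAndPort);
    if (until == null) {
      return false;
    }
    if (System.currentTimeMillis() >= until) {
      failedUntil.remove(hostAndPort);
      LOG.info("Removed {} from the failed servers list (entry expired)", hostAndPort);
      return false;
    }
    return true;
  }
}
{code}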

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 1.3.0
>Reporter: Sean Busbey
>Assignee: Ali
> Attachments: HBASE-18054.patch
>
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ali updated HBASE-18054:

Attachment: HBASE-18054.patch

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>Assignee: Ali
> Attachments: HBASE-18054.patch
>
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030370#comment-16030370
 ] 

Andrew Purtell commented on HBASE-18054:


[~aky] Try now please

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>Assignee: Ali
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-18054:
--

Assignee: Ali

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>Assignee: Ali
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-15602:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HBASE-14850
   Status: Resolved  (was: Patch Available)

Pushed this to the branch. Thanks for the patch [~Scott]. I had to do two 
trivial fixes to the patch because some -test files were causing compilation 
failures. 

For next time, you can use 
{code}
buck test --all 
{code}
to compile and execute all the unit tests (the Makefile does not build the tests 
yet). buck is best run inside the docker environment. 

> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Fix For: HBASE-14850
>
> Attachments: HBASE-15602.HBASE-14850.patch, 
> HBASE-15602.HBASE-14850.v2.patch, HBASE-15602.HBASE-14850.v3.patch
>
>
> There's a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should use the using directive better than that 
> when possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030353#comment-16030353
 ] 

Ali commented on HBASE-18054:
-

[~apurtell] I can't seem to attach a patch at the moment. I think I need to be 
assigned this task.


> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030348#comment-16030348
 ] 

Andrew Purtell commented on HBASE-18054:


[~aky] Did you forget to attach a patch?

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030311#comment-16030311
 ] 

Hadoop QA commented on HBASE-15602:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} docker {color} | {color:blue} 0m 13s 
{color} | {color:blue} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/docker/Dockerfile'
 not found, falling back to built-in. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 7m 48s 
{color} | {color:red} Docker failed to build yetus/hbase:date2017-05-30. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870460/HBASE-15602.HBASE-14850.v3.patch
 |
| JIRA Issue | HBASE-15602 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7005/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch, 
> HBASE-15602.HBASE-14850.v2.patch, HBASE-15602.HBASE-14850.v3.patch
>
>
> There's a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should use the using directive better than that 
> when possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Scott Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Hunt updated HBASE-15602:
---
Attachment: HBASE-15602.HBASE-14850.v3.patch

And of course, I jumped the gun before reading your entire comment.  Sorry 
about that.
Thanks for the tip about format-code.sh.

patch v3 is after running format-code.sh

> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch, 
> HBASE-15602.HBASE-14850.v2.patch, HBASE-15602.HBASE-14850.v3.patch
>
>
> There's a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should use the using directive better than that 
> when possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Scott Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Hunt updated HBASE-15602:
---
Attachment: HBASE-15602.HBASE-14850.v2.patch

Here's a rebased version (last commit = 
517090a09a252b28cfda7083c91e53a4888bf9e2, HBASE-17860)
A couple things changed upstream, but not much.

> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch, 
> HBASE-15602.HBASE-14850.v2.patch
>
>
> There's a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should use the using directive better than that 
> when possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030280#comment-16030280
 ] 

Hadoop QA commented on HBASE-16392:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 4s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
38s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 40s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 114m 13s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
34s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 173m 39s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870424/HBASE-16392-v2.patch |
| JIRA Issue | HBASE-16392 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux c5f8c102d858 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6846b03 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7002/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7002/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modified file system and 

[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030265#comment-16030265
 ] 

Ted Yu commented on HBASE-17707:


Kahlil:
Can you outline the change(s) in patch v13 which make the balancer test more 
stable?

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.
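
For illustration, a simplified sketch of the cost computation described above 
(not the attached patch): count the minimal moves per table, normalize by the 
worst case, combine the per-table average and maximum with configurable 
weights, and take the square root:

{code}
public class TableSkewCostSketch {
  /**
   * regionsPerTable[t][s] = number of regions of table t hosted on server s.
   * maxWeight and avgWeight are the configurable weights mentioned above.
   */
  static double tableSkewCost(int[][] regionsPerTable, double maxWeight, double avgWeight) {
    if (regionsPerTable.length == 0) {
      return 0;
    }
    double sum = 0, max = 0;
    for (int[] perServer : regionsPerTable) {
      int total = 0;
      for (int c : perServer) {
        total += c;
      }
      if (total == 0) {
        continue;
      }
      int servers = perServer.length;
      int floor = total / servers, remainder = total % servers;
      // Count regions sitting above the per-server floor, and how many servers
      // could usefully be granted one of the remainder "floor + 1" slots.
      long surplus = 0;
      int saturated = 0;
      for (int c : perServer) {
        if (c > floor) {
          surplus += c - floor;
          saturated++;
        }
      }
      long moves = surplus - Math.min(remainder, saturated);
      // Worst case: the whole table on one server; everything above its fair share moves.
      long worstCase = total - (floor + (remainder > 0 ? 1 : 0));
      double normalized = worstCase == 0 ? 0 : (double) moves / worstCase;
      sum += normalized;
      max = Math.max(max, normalized);
    }
    double avg = sum / regionsPerTable.length;
    double weighted = (maxWeight * max + avgWeight * avg) / (maxWeight + avgWeight);
    return Math.sqrt(weighted);
  }

  public static void main(String[] args) {
    // One table with all four regions on the first of three servers: maximally skewed.
    int[][] layout = { { 4, 0, 0 } };
    System.out.println(tableSkewCost(layout, 1.0, 1.0)); // prints 1.0
  }
}
{code}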



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17988) get-active-master.rb and draining_servers.rb no longer work

2017-05-30 Thread Chinmay Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni reassigned HBASE-17988:


Assignee: Chinmay Kulkarni

> get-active-master.rb and draining_servers.rb no longer work
> ---
>
> Key: HBASE-17988
> URL: https://issues.apache.org/jira/browse/HBASE-17988
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Chinmay Kulkarni
>Priority: Critical
> Fix For: 2.0.0
>
>
> The scripts {{bin/get-active-master.rb}} and {{bin/draining_servers.rb}} no 
> longer work on current master branch. Here is an example error message:
> {noformat}
> $ bin/hbase-jruby bin/get-active-master.rb 
> NoMethodError: undefined method `masterAddressZNode' for 
> #
>at bin/get-active-master.rb:35
> {noformat}
> My initial probing suggests that this is likely due to movement that happened 
> in HBASE-16690. Perhaps instead of reworking the ruby, there is similar Java 
> functionality already existing somewhere.
> Putting priority at critical since it's impossible to know whether users rely 
> on the scripts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030228#comment-16030228
 ] 

Enis Soztutar commented on HBASE-15602:
---

[~Scott] thanks for working on this. It is a big patch, let me check whether it 
still applies. Probably needs rebasing. 

bq. I had to make some guesses at what might be the desired style. Feel free 
to tell me where I need to make more changes. (i.e. I may have ended up 
under-using using statements for someone's taste.)
The standard practice is to use the clang-format tool to format the patches 
automatically so that everyone will use the same exact styling. There is a 
script under {{bin/format-code.sh}} which can be run inside the docker 
environment (bin/start-docker.sh). 

bq. I didn't touch any -test.cc files. I can do these if desired.
These are lower priority, but the cleaner the better. We can do a different 
issue. 

bq. While compiling, I encountered quite a few gcc warnings (primarily on 
constructor initializer order.) I have a further patch which attempts to clean 
all those up.
Sounds good. The patch is pretty big anyways, let's do a follow up patch. 

> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch
>
>
> There's a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should use the using directive better than that 
> when possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18027) Replication should respect RPC size limits when batching edits

2017-05-30 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18027:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Replication should respect RPC size limits when batching edits
> --
>
> Key: HBASE-18027
> URL: https://issues.apache.org/jira/browse/HBASE-18027
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, 
> HBASE-18027-branch-1.patch, HBASE-18027-branch-1.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch, 
> HBASE-18027.patch, HBASE-18027.patch, HBASE-18027.patch
>
>
> In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in 
> batches. We create N lists. N is the minimum of configured replicator 
> threads, number of 100-waledit batches, or number of current sinks. Every 
> pending entry in the replication context is then placed in order by hash of 
> encoded region name into one of these N lists. Each of the N lists is then 
> sent all at once in one replication RPC. We do not test if the sum of data in 
> each N list will exceed RPC size limits. This code presumes each individual 
> edit is reasonably small. Not checking for aggregate size while assembling 
> the lists into RPCs is an oversight and can lead to replication failure when 
> that assumption is violated.
> We can fix this by generating as many replication RPC calls as we need to 
> drain a list, keeping each RPC under limit, instead of assuming the whole 
> list will fit in one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030218#comment-16030218
 ] 

Hadoop QA commented on HBASE-17707:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
53s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
43s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m 55s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 111m 51s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 
37s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 160m 26s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapreduce.TestMultiTableInputFormat |
|   | hadoop.hbase.snapshot.TestMobRestoreFlushSnapshotFromClient |
|   | hadoop.hbase.mapreduce.TestImportTSVWithTTLs |
|   | hadoop.hbase.mapreduce.TestLoadIncrementalHFiles |
|   | hadoop.hbase.master.procedure.TestSafemodeBringsDownMaster |
|   | hadoop.hbase.mapreduce.TestImportTSVWithVisibilityLabels |
|   | hadoop.hbase.master.procedure.TestDeleteColumnFamilyProcedure |
|   | hadoop.hbase.mapreduce.TestTableInputFormatScan2 |
|   | hadoop.hbase.TestNamespace |
| Timed out junit tests | 
org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
|   | org.apache.hadoop.hbase.TestClusterBootOrder |
|   | org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot |
|   | org.apache.hadoop.hbase.mapreduce.TestTableMapReduce |
|   | org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot |
|   | 
org.apache.hadoop.hbase.master.procedure.TestDeleteColumnFamilyProcedureFromClient
 |
|   | org.apache.hadoop.hbase.snapshot.TestMobExportSnapshot |
|   | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
|   | org.apache.hadoop.hbase.mapreduce.TestImportExport |
|   | org.apache.hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper |
|   | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1 |
|   | 

[jira] [Updated] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)

2017-05-30 Thread Amit Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Patel updated HBASE-16148:
---
Attachment: HBASE-16148.master.005.patch

> Hybrid Logical Clocks(placeholder for running tests)
> 
>
> Key: HBASE-16148
> URL: https://issues.apache.org/jira/browse/HBASE-16148
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sai Teja Ranuva
>Assignee: Sai Teja Ranuva
>Priority: Minor
>  Labels: test-patch
> Attachments: HBASE-16148.master.001.patch, 
> HBASE-16148.master.002.patch, HBASE-16148.master.003.patch, 
> HBASE-16148.master.004.patch, HBASE-16148.master.005.patch, 
> HBASE-16148.master.6.patch, HBASE-16148.master.test.1.patch, 
> HBASE-16148.master.test.2.patch, HBASE-16148.master.test.3.patch, 
> HBASE-16148.master.test.4.patch, HBASE-16148.master.test.5.patch, 
> HLC.10.1.patch, HLC.10.2.patch, HLC.10.3.patch, HLC.10.4.patch, 
> HLC.10.5.patch, HLC.10.6.patch, HLC.10.7.patch, HLC.10.patch, HLC.1.patch, 
> HLC.2.patch, HLC.3.patch, HLC.4.patch, HLC.5.patch, HLC.6.patch, HLC.8.patch, 
> HLC.9.patch, HLC.patch
>
>
> This JIRA is just a placeholder to test Hybrid Logical Clocks code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18078) [C++] Harden RPC by handling various communication abnormalities

2017-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030188#comment-16030188
 ] 

Enis Soztutar commented on HBASE-18078:
---

- You are still calling Future::get() here in the 
{{ConnectionFactory::AsyncConnect}}, no? 
{code}
+auto pipeline = client->connect(
+SocketAddress(hostname, port, true),
+std::chrono::duration_cast(connect_timeout_)).get();
{code}
- If this patch is not changing the blocking nature of our TCP connection 
establishment, maybe we should not introduce these methods, especially 
{{ConnectionPool::AsyncGetNewConnection}} which seems to be a copy of the other 
method. 

> [C++] Harden RPC by handling various communication abnormalities
> 
>
> Key: HBASE-18078
> URL: https://issues.apache.org/jira/browse/HBASE-18078
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HBASE-18078.000.patch, HBASE-18078.001.patch, 
> HBASE-18078.002.patch
>
>
> RPC layer should handle various communication abnormalities (e.g. connection 
> timeout, server aborted connection, and so on). Ideally, the corresponding 
> exceptions should be raised and propagated through handlers of pipeline in 
> client.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-17860) Implement secure native client connection

2017-05-30 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-17860.

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HBASE-14850

Thanks for the reviews, Elliott, Devaraj and Enis.

> Implement secure native client connection
> -
>
> Key: HBASE-17860
> URL: https://issues.apache.org/jira/browse/HBASE-17860
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
>  Labels: native
> Fix For: HBASE-14850
>
> Attachments: 17860.v21.txt, 17860.v2.txt, 17860.v3.txt, 
> 17860.v43.txt, 17860.v4.txt
>
>
> So far, the native client can only communicate with an insecure cluster.
> This JIRA is to add secure connection support for the native client using the 
> Cyrus library.
> The work is based on an earlier implementation and is redone via the wangle 
> and folly frameworks.
> Thanks to [~devaraj] who started the initiative.
> Here is high level description of the design:
> * SaslHandler is declared as:
> {code}
> class SaslHandler
> : public wangle::HandlerAdapter std::unique_ptr>{
> {code}
> It would be inserted between EventBaseHandler and 
> LengthFieldBasedFrameDecoder in the pipeline (via 
> RpcPipelineFactory::newPipeline())
> * SaslHandler would intercept writes to the server by buffering the IOBufs 
> and start the handshake process (via sasl_client_XX calls provided by Cyrus)
> * after the handshake is complete, SaslHandler would send the buffered IOBufs 
> to the server and act as a pass-through from then on
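The buffer-until-handshake, then pass-through behaviour described in the last two bullets can be sketched independently of wangle and SASL. The toy class below only illustrates that pattern under simplified assumptions (strings instead of IOBufs, no real negotiation); it is not the SaslHandler implementation, and its names are invented for the example.
{code}
#include <iostream>
#include <string>
#include <vector>

// Toy sketch of "buffer writes until the handshake completes, then pass through".
class BufferingHandler {
 public:
  void write(const std::string& frame) {
    if (!handshake_done_) {
      pending_.push_back(frame);   // hold client writes during the handshake
    } else {
      sendToServer(frame);         // pass-through once negotiation is done
    }
  }

  void onHandshakeComplete() {
    handshake_done_ = true;
    for (const auto& frame : pending_) {
      sendToServer(frame);         // flush everything buffered so far
    }
    pending_.clear();
  }

 private:
  void sendToServer(const std::string& frame) {
    std::cout << "-> server: " << frame << std::endl;
  }

  bool handshake_done_ = false;
  std::vector<std::string> pending_;
};

int main() {
  BufferingHandler handler;
  handler.write("GetRequest");     // buffered; handshake not finished yet
  handler.onHandshakeComplete();   // flushes the buffered request
  handler.write("ScanRequest");    // sent immediately
  return 0;
}
{code}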



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14070) Hybrid Logical Clocks for HBase

2017-05-30 Thread Amit Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Patel updated HBASE-14070:
---
Attachment: (was: HBASE-14070.master.002.patch)

> Hybrid Logical Clocks for HBase
> ---
>
> Key: HBASE-14070
> URL: https://issues.apache.org/jira/browse/HBASE-14070
> Project: HBase
>  Issue Type: New Feature
>Reporter: Enis Soztutar
>Assignee: Amit Patel
> Attachments: HBASE-14070.master.001.patch, 
> HybridLogicalClocksforHBaseandPhoenix.docx, 
> HybridLogicalClocksforHBaseandPhoenix.pdf
>
>
> HBase and Phoenix use the system's physical clock (PT) to assign timestamps to 
> events (reads and writes). This mostly works when the system clock is strictly 
> monotonically increasing and there is no cross-dependency between servers' 
> clocks. However, we know that leap seconds, general clock skew and clock drift 
> are in fact real. 
> This JIRA proposes using Hybrid Logical Clocks (HLC), an implementation of a 
> hybrid physical clock plus a logical clock. HLC is the best of both worlds: it 
> keeps a causality relationship similar to logical clocks, but is still 
> compatible with an NTP-based physical system clock. An HLC timestamp can be 
> represented in 64 bits. 
> A design document is attached and also can be found here: 
> https://docs.google.com/document/d/1LL2GAodiYi0waBz5ODGL4LDT4e_bXy8P9h6kWC05Bhw/edit#
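As a rough illustration of how a hybrid physical plus logical clock fits into 64 bits, the sketch below packs a physical timestamp into the high bits and a logical counter into the low bits. The 44/20 bit split is assumed here purely for the example; the actual layout is defined by the design document linked above.
{code}
#include <cstdint>
#include <iostream>

// Assumed split for illustration only: 44 bits of physical time (ms) in the
// high bits, 20 bits of logical counter in the low bits.
constexpr int kLogicalBits = 20;
constexpr uint64_t kLogicalMask = (1ULL << kLogicalBits) - 1;

uint64_t toHlc(uint64_t physicalMs, uint64_t logical) {
  return (physicalMs << kLogicalBits) | (logical & kLogicalMask);
}

uint64_t physicalOf(uint64_t hlc) { return hlc >> kLogicalBits; }
uint64_t logicalOf(uint64_t hlc)  { return hlc & kLogicalMask; }

int main() {
  uint64_t ts = toHlc(1496167800000ULL, 7);   // physical ms plus logical count 7
  std::cout << physicalOf(ts) << " / " << logicalOf(ts) << std::endl;
  return 0;
}
{code}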



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14070) Hybrid Logical Clocks for HBase

2017-05-30 Thread Amit Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Patel updated HBASE-14070:
---
Attachment: HBASE-14070.master.002.patch

> Hybrid Logical Clocks for HBase
> ---
>
> Key: HBASE-14070
> URL: https://issues.apache.org/jira/browse/HBASE-14070
> Project: HBase
>  Issue Type: New Feature
>Reporter: Enis Soztutar
>Assignee: Amit Patel
> Attachments: HBASE-14070.master.001.patch, 
> HBASE-14070.master.002.patch, HybridLogicalClocksforHBaseandPhoenix.docx, 
> HybridLogicalClocksforHBaseandPhoenix.pdf
>
>
> HBase and Phoenix use the system's physical clock (PT) to assign timestamps to 
> events (reads and writes). This mostly works when the system clock is strictly 
> monotonically increasing and there is no cross-dependency between servers' 
> clocks. However, we know that leap seconds, general clock skew and clock drift 
> are in fact real. 
> This JIRA proposes using Hybrid Logical Clocks (HLC), an implementation of a 
> hybrid physical clock plus a logical clock. HLC is the best of both worlds: it 
> keeps a causality relationship similar to logical clocks, but is still 
> compatible with an NTP-based physical system clock. An HLC timestamp can be 
> represented in 64 bits. 
> A design document is attached and also can be found here: 
> https://docs.google.com/document/d/1LL2GAodiYi0waBz5ODGL4LDT4e_bXy8P9h6kWC05Bhw/edit#



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17860) Implement secure native client connection

2017-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030157#comment-16030157
 ] 

Enis Soztutar commented on HBASE-17860:
---

+1 for the patch in RB. 

> Implement secure native client connection
> -
>
> Key: HBASE-17860
> URL: https://issues.apache.org/jira/browse/HBASE-17860
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
>  Labels: native
> Attachments: 17860.v21.txt, 17860.v2.txt, 17860.v3.txt, 
> 17860.v43.txt, 17860.v4.txt
>
>
> So far, the native client communicates with an insecure cluster.
> This JIRA is to add secure connection support for the native client using the 
> Cyrus library.
> The work is based on an earlier implementation and is redone via the wangle 
> and folly frameworks.
> Thanks to [~devaraj] who started the initiative.
> Here is a high-level description of the design:
> * SaslHandler is declared as:
> {code}
> class SaslHandler
> : public wangle::HandlerAdapter<std::unique_ptr<folly::IOBuf>, std::unique_ptr<folly::IOBuf>>{
> {code}
> It would be inserted between EventBaseHandler and 
> LengthFieldBasedFrameDecoder in the pipeline (via 
> RpcPipelineFactory::newPipeline())
> * SaslHandler would intercept writes to the server by buffering the IOBufs 
> and start the handshake process (via sasl_client_XX calls provided by Cyrus)
> * after the handshake is complete, SaslHandler would send the buffered IOBufs 
> to the server and act as a pass-through from then on



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17959) Canary timeout should be configurable on a per-table basis

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030097#comment-16030097
 ] 

Hadoop QA commented on HBASE-17959:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 41s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
29s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
45s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
28m 12s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 115m 24s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 172m 0s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869342/HBASE-17959.003.patch 
|
| JIRA Issue | HBASE-17959 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 02979546bc4e 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6846b03 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6998/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6998/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Canary timeout should be configurable on a per-table basis
> --
>
> Key: HBASE-17959
> URL: https://issues.apache.org/jira/browse/HBASE-17959
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Andrew Purtell
>Assignee: Chinmay Kulkarni
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> 

[jira] [Updated] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-05-30 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14614:
--
Attachment: HBASE-14614.master.048.patch

> Procedure v2: Core Assignment Manager
> -
>
> Key: HBASE-14614
> URL: https://issues.apache.org/jira/browse/HBASE-14614
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-14614.master.003.patch, 
> HBASE-14614.master.004.patch, HBASE-14614.master.005.patch, 
> HBASE-14614.master.006.patch, HBASE-14614.master.007.patch, 
> HBASE-14614.master.008.patch, HBASE-14614.master.009.patch, 
> HBASE-14614.master.010.patch, HBASE-14614.master.012.patch, 
> HBASE-14614.master.013.patch, HBASE-14614.master.014.patch, 
> HBASE-14614.master.015.patch, HBASE-14614.master.017.patch, 
> HBASE-14614.master.018.patch, HBASE-14614.master.019.patch, 
> HBASE-14614.master.020.patch, HBASE-14614.master.022.patch, 
> HBASE-14614.master.023.patch, HBASE-14614.master.024.patch, 
> HBASE-14614.master.025.patch, HBASE-14614.master.026.patch, 
> HBASE-14614.master.027.patch, HBASE-14614.master.028.patch, 
> HBASE-14614.master.029.patch, HBASE-14614.master.030.patch, 
> HBASE-14614.master.033.patch, HBASE-14614.master.038.patch, 
> HBASE-14614.master.039.patch, HBASE-14614.master.040.patch, 
> HBASE-14614.master.041.patch, HBASE-14614.master.042.patch, 
> HBASE-14614.master.043.patch, HBASE-14614.master.044.patch, 
> HBASE-14614.master.045.patch, HBASE-14614.master.045.patch, 
> HBASE-14614.master.046.patch, HBASE-14614.master.047.patch, 
> HBASE-14614.master.048.patch
>
>
> New AssignmentManager implemented using proc-v2.
>  - AssignProcedure handles the assign operation
>  - UnassignProcedure handles the unassign operation
>  - MoveRegionProcedure handles the move/balance operation
> Concurrent Assign operations are batched together and sent to the balancer.
> Concurrent Assign and Unassign operations that are ready to be sent to the RS 
> are batched together.
> This patch is an intermediate state where we add the new AM as 
> AssignmentManager2() to the master, to be reached by tests, but the new AM 
> will not be integrated with the rest of the system. Only the new AM unit tests 
> will exercise the new assignment manager. The integration with the master code 
> is part of HBASE-14616.
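The batching mentioned in the description (collecting concurrent operations and handing the whole batch to one downstream call) can be illustrated generically. The sketch below is not the procedure-v2 code; the class and method names are invented for the example.
{code}
#include <iostream>
#include <string>
#include <vector>

// Generic batching sketch: queue operations as they arrive, then dispatch the
// accumulated batch with a single downstream call instead of one per region.
class AssignBatcher {
 public:
  void add(const std::string& region) { batch_.push_back(region); }

  void flush() {
    if (batch_.empty()) return;
    std::cout << "dispatching " << batch_.size() << " assignments" << std::endl;
    batch_.clear();
  }

 private:
  std::vector<std::string> batch_;
};

int main() {
  AssignBatcher batcher;
  batcher.add("region-a");
  batcher.add("region-b");
  batcher.flush();   // prints: dispatching 2 assignments
  return 0;
}
{code}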



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030045#comment-16030045
 ] 

Enis Soztutar commented on HBASE-15160:
---

bq. Previously the concern on readAtOffset completely make sense, but 
HBASE-17917 has removed the stream lock so no more stream read when pread is 
true, which makes it possible to move the updating of the metrics up to the 
caller (smile).
Agreed that with the stream lock gone, we can always know when it was a pread 
and when it was not from the caller. However, why do you think that updating 
the metrics should be pulled up the stack? Since there are no other 
synchronization points, they are equal in terms of cost. The reason I wanted it 
pushed down the stack is that in some cases (for example a checksum failure) 
we do two reads transparently to the caller. Metrics pulled up the 
stack would be slightly incorrect when things like that happen. Also, I want to 
backport this to branch-1, so keeping the metrics update here should give us 
better portability of future patches. 
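The trade-off being discussed (updating the latency metric at the lowest-level read versus at the caller) can be sketched as follows. This is a generic illustration, not the HFile code; the retry-on-checksum-failure is simulated and the metric is a plain counter.
{code}
#include <iostream>

// If the metric is updated inside the low-level read, a retry that is
// transparent to the caller (e.g. after a checksum failure) is still counted;
// a metric updated only by the caller would see a single "read".
struct Metrics {
  long reads = 0;
};

bool readOnce(bool failChecksum) { return !failChecksum; }

void readAtOffset(Metrics& metrics, bool firstAttemptFails) {
  bool ok = readOnce(firstAttemptFails);
  metrics.reads++;              // count the first physical read
  if (!ok) {
    readOnce(false);            // transparent retry after the checksum failure
    metrics.reads++;            // ...which is counted here as well
  }
}

int main() {
  Metrics metrics;
  readAtOffset(metrics, /*firstAttemptFails=*/true);
  std::cout << "reads recorded: " << metrics.reads << std::endl;  // prints 2
  return 0;
}
{code}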


> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, was removed. There was some 
> discussion about putting it back in a new JIRA, but that never happened. 
> In our experience, these metrics are useful for judging whether the issue 
> lies in HDFS when a slow request occurs, so we propose to put them back in this 
> JIRA and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029986#comment-16029986
 ] 

Vladimir Rodionov commented on HBASE-16392:
---

Review Board link:
https://reviews.apache.org/r/59646/

> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modifies the file system and the backup system table. We have to 
> make sure that the operation is atomic, durable and isolated.
> Delete operation:
> # Start a backup session (this guarantees that the system will be blocked for 
> all backup commands during the delete operation)
> # Save the list of tables being deleted to the system table
> # Before the delete operation we take a backup system table snapshot
> # During the delete operation we detect any failures and restore the backup 
> system table from the snapshot, then finish the backup session
> # To guarantee consistency of the data, the delete operation MUST be repeated
> # We guarantee that all file delete operations are idempotent and can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, the repair command must be executed
> # The repair command checks whether there is a failed delete op in the backup 
> system table and repeats the delete operation
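The snapshot-then-rollback discipline in the delete steps above follows a common pattern: capture the system table state, attempt the destructive operation, and restore the captured state on any failure so a later repair can repeat the idempotent deletes. The sketch below is a generic illustration of that pattern only; SystemTable and its members are invented stand-ins, not HBase backup APIs.
{code}
#include <iostream>
#include <stdexcept>
#include <string>

// Stand-in for the backup system table; snapshot/restore mimic steps 3 and 4.
struct SystemTable {
  std::string state = "initial";
  std::string snapshot() const { return state; }
  void restore(const std::string& snap) { state = snap; }
};

void deleteBackups(SystemTable& table, bool injectFailure) {
  std::string snap = table.snapshot();   // take the snapshot before deleting
  try {
    table.state = "deleting";
    if (injectFailure) throw std::runtime_error("delete failed");
    table.state = "deleted";             // success: keep the new state
  } catch (const std::exception& e) {
    table.restore(snap);                 // failure: roll back to the snapshot
    std::cout << "restored after: " << e.what() << std::endl;
  }
}

int main() {
  SystemTable table;
  deleteBackups(table, /*injectFailure=*/true);
  std::cout << table.state << std::endl;   // prints "initial"
  return 0;
}
{code}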



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-16392) Backup delete fault tolerance

2017-05-30 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-16392:
--
Attachment: HBASE-16392-v2.patch

v2. cc: [~te...@apache.org]

> Backup delete fault tolerance
> -
>
> Key: HBASE-16392
> URL: https://issues.apache.org/jira/browse/HBASE-16392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch
>
>
> Backup delete modified file system and backup system table. We have to make 
> sure that operation is atomic, durable and isolated.
> Delete operation:
> # Start backup session (this guarantees) that system will be blocked for all 
> backup commands during delete operation
> # Save list of tables being deleted to system table
> # Before delete operation we take backup system table snapshot  
> # During delete operation we detect any failures and restore backup system 
> table from snapshot, then finish backup session
> # To guarantee consistency of the data, delete operation MUST be repeated
> # We guarantee that all file delete operations are idempotent, can be 
> repeated multiple times
> # Any backup operations will be blocked until consistency is restored
> # To restore consistency, repair command must be executed.
> # Repair command checks if there is failed delete op in a backup system 
> table, and repeats delete operation



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029967#comment-16029967
 ] 

Hadoop QA commented on HBASE-15602:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} docker {color} | {color:blue} 0m 13s 
{color} | {color:blue} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/docker/Dockerfile'
 not found, falling back to built-in. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 5m 19s 
{color} | {color:red} Docker failed to build yetus/hbase:date2017-05-30. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869343/HBASE-15602.HBASE-14850.patch
 |
| JIRA Issue | HBASE-15602 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6999/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch
>
>
> There are a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should be more disciplined with using directives 
> where possible.
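For context, the cleanup being asked for is roughly the difference shown below: instead of a blanket using-directive at file scope, pull in only the names the .cc file actually needs, or fully qualify them. The snippet is a generic illustration, not taken from the native client sources.
{code}
#include <iostream>
#include <string>
#include <vector>

// Avoid: using namespace std;   // dumps the whole namespace into this file.
// Prefer narrow using-declarations for just the names the file uses:
using std::string;
using std::vector;

int main() {
  vector<string> components{"folly", "wangle", "hbase"};
  for (const string& name : components) {
    std::cout << name << std::endl;   // or fully qualify, as done here
  }
  return 0;
}
{code}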



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Status: Open  (was: Patch Available)

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.
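To make the normalization and weighting described above concrete, here is a small numeric sketch of the cost: per-table move counts are normalized by the worst case, combined as a weighted average of the mean and the maximum, and the square root of that is returned. The 0.5/0.5 weights and the sample numbers are made up for the example; they are not the patch's defaults.
{code}
#include <algorithm>
#include <cmath>
#include <iostream>
#include <numeric>
#include <vector>

// Illustrative computation of the described table-skew cost in [0, 1].
double tableSkewCost(const std::vector<double>& normalizedMoves,
                     double avgWeight, double maxWeight) {
  double avg = std::accumulate(normalizedMoves.begin(), normalizedMoves.end(), 0.0) /
               normalizedMoves.size();
  double max = *std::max_element(normalizedMoves.begin(), normalizedMoves.end());
  // Weighted average of mean and max, then a square root to spread the values.
  return std::sqrt(avgWeight * avg + maxWeight * max);
}

int main() {
  // Per-table move counts already divided by each table's worst-case count.
  std::vector<double> normalizedMoves{0.10, 0.25, 0.05};
  std::cout << tableSkewCost(normalizedMoves, 0.5, 0.5) << std::endl;
  return 0;
}
{code}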



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Attachment: HBASE-17707-13.patch

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Status: Patch Available  (was: Open)

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Attachment: (was: HBASE-17707-13.patch)

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029941#comment-16029941
 ] 

Kahlil Oppenheimer commented on HBASE-17707:


[~enis] [~tedyu], sorry I had taken a quick break from this, but just got back 
to it. I've uploaded yet another version of the patch. Hopefully, this 
addresses all of your concerns.

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Status: Patch Available  (was: In Progress)

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-17707 started by Kahlil Oppenheimer.
--
> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15602) Clean up using directives in cc files.

2017-05-30 Thread Scott Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Hunt updated HBASE-15602:
---
Affects Version/s: HBASE-14850
   Status: Patch Available  (was: Open)

> Clean up using directives in cc files.
> --
>
> Key: HBASE-15602
> URL: https://issues.apache.org/jira/browse/HBASE-15602
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-14850
>Reporter: Elliott Clark
>Assignee: Scott Hunt
>  Labels: beginner, easy, starter
> Attachments: HBASE-15602.HBASE-14850.patch
>
>
> There are a ton of files that just barf out all of folly, wangle, and hbase 
> into the global namespace. We should be more disciplined with using directives 
> where possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-05-30 Thread Kahlil Oppenheimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kahlil Oppenheimer updated HBASE-17707:
---
Attachment: HBASE-17707-13.patch

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, HBASE-17707-13.patch, 
> test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-18054) log when we add/remove failed servers in client

2017-05-30 Thread Ali (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029901#comment-16029901
 ] 

Ali edited comment on HBASE-18054 at 5/30/17 6:43 PM:
--

logging that mentions failed servers is in debug


was (Author: aky):
logging for failed server is in debug

> log when we add/remove failed servers in client
> ---
>
> Key: HBASE-18054
> URL: https://issues.apache.org/jira/browse/HBASE-18054
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Sean Busbey
>
> Currently we log if a server is in the failed server list when we go to 
> connect to it, but we don't log anything about when the server got into the 
> list.
> This means we have to search the log for errors involving the same server 
> name that (hopefully) managed to get into the log within 
> {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

