[jira] [Updated] (HBASE-17289) Avoid adding a replication peer named "lock"

2016-12-11 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17289:
---
Attachment: HBASE-17289-branch-1.patch

Attach patch for branch-1.

> Avoid adding a replication peer named "lock"
> 
>
> Key: HBASE-17289
> URL: https://issues.apache.org/jira/browse/HBASE-17289
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17289-branch-1.1.patch, 
> HBASE-17289-branch-1.2.patch, HBASE-17289-branch-1.3.patch, 
> HBASE-17289-branch-1.patch
>
>
> When the ZooKeeper-based replication queue is used and useMulti is false, transferring 
> replication queues takes three steps: first add a lock, then copy the nodes, and finally 
> clean up the old queue and the lock. The default lock znode's name is "lock", so we 
> should avoid adding a peer named "lock".
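As a rough illustration of the fix, here is a minimal sketch of such a guard (the class, constant, and method names are made up for illustration and are not taken from the patch):

{code}
// Illustrative sketch only: reject peer ids that collide with the lock znode name.
public class ReplicationPeerIdCheck {
  // Default name of the lock znode used when transferring queues without ZK multi.
  private static final String RS_LOCK_ZNODE = "lock";

  public static void checkPeerId(String peerId) {
    if (RS_LOCK_ZNODE.equals(peerId)) {
      throw new IllegalArgumentException("Cannot add a replication peer named '"
          + RS_LOCK_ZNODE + "': it collides with the replication queue lock znode");
    }
  }
}
{code}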



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17289) Avoid adding a replication peer named "lock"

2016-12-11 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15740679#comment-15740679
 ] 

Guanghao Zhang commented on HBASE-17289:


Thanks. This variable can be used in the test, too. It is in a different package, 
org.apache.hadoop.hbase.client.replication.

> Avoid adding a replication peer named "lock"
> 
>
> Key: HBASE-17289
> URL: https://issues.apache.org/jira/browse/HBASE-17289
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17289-branch-1.1.patch, 
> HBASE-17289-branch-1.2.patch, HBASE-17289-branch-1.3.patch, 
> HBASE-17289-branch-1.patch
>
>
> When the ZooKeeper-based replication queue is used and useMulti is false, transferring 
> replication queues takes three steps: first add a lock, then copy the nodes, and finally 
> clean up the old queue and the lock. The default lock znode's name is "lock", so we 
> should avoid adding a peer named "lock".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-14 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Description: 
We have moved the other replication requests to Admin and marked ReplicationAdmin as 
deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to move 
to Admin, too.

Review board: https://reviews.apache.org/r/55534/

  was:We have moved the other replication requests to Admin and marked ReplicationAdmin 
as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to move 
to Admin, too.


> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch, 
> HBASE-17443-v2.patch, HBASE-17443-v3.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.
> Review board: https://reviews.apache.org/r/55534/
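For context, a minimal usage sketch of what these operations could look like once they live on Admin rather than the deprecated ReplicationAdmin; the Admin method names used here are assumptions, not confirmed by this thread:

{code}
// Illustrative sketch; the Admin method names are assumed, not confirmed by this thread.
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.replication.TableCFs;

public class TableReplicationViaAdmin {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("t1");
      admin.enableTableReplication(table);    // replaces ReplicationAdmin#enableTableRep
      admin.disableTableReplication(table);   // replaces ReplicationAdmin#disableTableRep
      List<TableCFs> replicated = admin.listReplicatedTableCFs(); // replaces ReplicationAdmin#listReplicated
      System.out.println("Replicated table CFs: " + replicated);
    }
  }
}
{code}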



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17442) Move most of the replication related classes to hbase-server package

2017-01-09 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17442:
--

 Summary: Move most of the replication related classes to 
hbase-server package
 Key: HBASE-17442
 URL: https://issues.apache.org/jira/browse/HBASE-17442
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Guanghao Zhang


After the replication requests are routed through the master, the replication 
implementation details no longer need to be exposed to the client. We should move most 
of the replication-related classes to the hbase-server package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17337) list replication peers request should be routed through master

2017-01-09 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17337:
---
Affects Version/s: 2.0.0

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch, HBASE-17337-v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17337) list replication peers request should be routed through master

2017-01-09 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17337:
---
  Resolution: Fixed
Release Note: The list replication peers request will be routed through the master.
  Status: Resolved  (was: Patch Available)

Pushed to master. Thanks all for review.

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch, HBASE-17337-v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-17442) Move most of the replication related classes to hbase-server package

2017-01-09 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reassigned HBASE-17442:
--

Assignee: Guanghao Zhang

> Move most of the replication related classes to hbase-server package
> 
>
> Key: HBASE-17442
> URL: https://issues.apache.org/jira/browse/HBASE-17442
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
>
> After the replication requests are routed through the master, the replication 
> implementation details no longer need to be exposed to the client. We should move 
> most of the replication-related classes to the hbase-server package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Attachment: HBASE-17443-v2.patch

The failed unit test is not related. Triggering Hadoop QA again.

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch, 
> HBASE-17443-v2.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17442) Move most of the replication related classes to hbase-server package

2017-01-10 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817025#comment-15817025
 ] 

Guanghao Zhang commented on HBASE-17442:


Now we don't have an hbase-replication module, which means we need a new module for 
hbase-replication. [~enis] What do you think about this?

One question: does our Hadoop QA only run the hbase-client and hbase-server unit tests? 
If we have an hbase-replication module, I thought the hbase-replication unit tests would 
also need to run every time?

> Move most of the replication related classes to hbase-server package
> 
>
> Key: HBASE-17442
> URL: https://issues.apache.org/jira/browse/HBASE-17442
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
>
> After the replication requests are routed through the master, the replication 
> implementation details no longer need to be exposed to the client. We should move 
> most of the replication-related classes to the hbase-server package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817224#comment-15817224
 ] 

Guanghao Zhang commented on HBASE-17443:


[~enis] [~ashish singhi] Can you help review this? Thanks.

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch, 
> HBASE-17443-v2.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14061) Support CF-level Storage Policy

2017-01-10 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817410#comment-15817410
 ] 

Guanghao Zhang commented on HBASE-14061:


[~carp84] Is this related to this failed unit test?

{code}
Unable to find suitable constructor for class org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor$FaultyDistributedFileSystem

Stacktrace

java.lang.UnsupportedOperationException: Unable to find suitable constructor for class org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor$FaultyDistributedFileSystem
at org.apache.hadoop.hbase.util.ReflectionUtils.findConstructor(ReflectionUtils.java:103)
at org.apache.hadoop.hbase.util.ReflectionUtils.newInstance(ReflectionUtils.java:73)
at org.apache.hadoop.hbase.fs.HFileSystem.newInstanceFileSystem(HFileSystem.java:260)
at org.apache.hadoop.hbase.fs.HFileSystem.<init>(HFileSystem.java:110)
at org.apache.hadoop.hbase.fs.HFileSystem.get(HFileSystem.java:476)
at org.apache.hadoop.hbase.HBaseTestingUtility.getTestFileSystem(HBaseTestingUtility.java:2951)
at org.apache.hadoop.hbase.HBaseTestingUtility.getNewDataTestDirOnTestFS(HBaseTestingUtility.java:565)
at org.apache.hadoop.hbase.HBaseTestingUtility.setupDataTestDirOnTestFS(HBaseTestingUtility.java:554)
at org.apache.hadoop.hbase.HBaseTestingUtility.getDataTestDirOnTestFS(HBaseTestingUtility.java:527)
at org.apache.hadoop.hbase.HBaseTestingUtility.getDefaultRootDirPath(HBaseTestingUtility.java:1228)
at org.apache.hadoop.hbase.HBaseTestingUtility.createRootDir(HBaseTestingUtility.java:1259)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1085)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1057)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:929)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:911)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:898)
at org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor.setUpBeforeClass(TestPartitionedMobCompactor.java:87)
{code}

> Support CF-level Storage Policy
> ---
>
> Key: HBASE-14061
> URL: https://issues.apache.org/jira/browse/HBASE-14061
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, regionserver
> Environment: hadoop-2.6.0
>Reporter: Victor Xu
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14061-master-v1.patch, HBASE-14061.addendum.patch, 
> HBASE-14061.addendum.patch, HBASE-14061.v2.patch, HBASE-14061.v3.patch, 
> HBASE-14061.v4.patch
>
>
> After reading [HBASE-12848|https://issues.apache.org/jira/browse/HBASE-12848] and 
> [HBASE-12934|https://issues.apache.org/jira/browse/HBASE-12934], I wrote a patch to 
> implement cf-level storage policy. 
> My main purpose is to improve random-read performance for some really hot data, which 
> usually resides in a certain column family of a big table.
> Usage:
> $ hbase shell
> > alter 'TABLE_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}
> > alter 'TABLE_NAME', {NAME=>'CF_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}}
> HDFS's setStoragePolicy can only take effect when a new hfile is created in a configured 
> directory, so I had to make sub-directories (one for each cf) in the region's .tmp 
> directory and set the storage policy for them.
> Besides, I had to upgrade the hadoop version to 2.6.0 because dfs.getStoragePolicy 
> cannot easily be invoked via reflection, and I needed this API to finish my unit test.
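To make the directory-level mechanism above concrete, here is a minimal sketch of setting an HDFS storage policy on a per-CF directory with Hadoop 2.6.0+; the path and policy name are made up for illustration:

{code}
// Illustrative sketch: apply an HDFS storage policy to a per-CF directory (Hadoop 2.6.0+).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class CfStoragePolicyExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical per-CF temp directory under a region; real paths are managed by HBase.
    Path cfTmpDir = new Path("/hbase/data/default/TABLE_NAME/region/.tmp/CF_NAME");
    fs.mkdirs(cfTmpDir);
    if (fs instanceof DistributedFileSystem) {
      // New hfiles created under this directory will follow the configured policy.
      ((DistributedFileSystem) fs).setStoragePolicy(cfTmpDir, "ALL_SSD");
    }
  }
}
{code}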



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Attachment: HBASE-17443-v2.patch

Updated an hbase-server unit test to trigger the Hadoop QA run.

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17442) Move most of the replication related classes to hbase-server package

2017-01-12 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821053#comment-15821053
 ] 

Guanghao Zhang commented on HBASE-17442:


[~stack] I agree with this. Thanks for your help. :)

[~enis] It's really good to know your thoughts about this.

> Move most of the replication related classes to hbase-server package
> 
>
> Key: HBASE-17442
> URL: https://issues.apache.org/jira/browse/HBASE-17442
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
>
> After the replication requests are routed through the master, the replication 
> implementation details no longer need to be exposed to the client. We should move 
> most of the replication-related classes to the hbase-server package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-12 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821060#comment-15821060
 ] 

Guanghao Zhang commented on HBASE-17443:


TestPartitionedMobCompactor has been resolved by HBASE-14061.

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch, 
> HBASE-17443-v2.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-13 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Attachment: HBASE-17443-v3.patch

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch, HBASE-17443-v2.patch, 
> HBASE-17443-v2.patch, HBASE-17443-v3.patch
>
>
> We have moved the other replication requests to Admin and marked ReplicationAdmin 
> as deprecated, so the listReplicated/enableTableRep/disableTableRep methods need to 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2017-01-09 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Attachment: HBASE-17396-v5.patch

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch, HBASE-17396-v2.patch, 
> HBASE-17396-v3.patch, HBASE-17396-v4.patch, HBASE-17396-v5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14061) Support CF-level Storage Policy

2017-01-11 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817922#comment-15817922
 ] 

Guanghao Zhang commented on HBASE-14061:


Tested the 2nd addendum patch locally and TestPartitionedMobCompactor passed. +1.

> Support CF-level Storage Policy
> ---
>
> Key: HBASE-14061
> URL: https://issues.apache.org/jira/browse/HBASE-14061
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, regionserver
> Environment: hadoop-2.6.0
>Reporter: Victor Xu
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14061-master-v1.patch, HBASE-14061.addendum.patch, 
> HBASE-14061.addendum.patch, HBASE-14061.addendum2.patch, 
> HBASE-14061.addendum2.patch, HBASE-14061.v2.patch, HBASE-14061.v3.patch, 
> HBASE-14061.v4.patch
>
>
> After reading [HBASE-12848|https://issues.apache.org/jira/browse/HBASE-12848] and 
> [HBASE-12934|https://issues.apache.org/jira/browse/HBASE-12934], I wrote a patch to 
> implement cf-level storage policy. 
> My main purpose is to improve random-read performance for some really hot data, which 
> usually resides in a certain column family of a big table.
> Usage:
> $ hbase shell
> > alter 'TABLE_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}
> > alter 'TABLE_NAME', {NAME=>'CF_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}}
> HDFS's setStoragePolicy can only take effect when a new hfile is created in a configured 
> directory, so I had to make sub-directories (one for each cf) in the region's .tmp 
> directory and set the storage policy for them.
> Besides, I had to upgrade the hadoop version to 2.6.0 because dfs.getStoragePolicy 
> cannot easily be invoked via reflection, and I needed this API to finish my unit test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17205) Add a metric for the duration of region in transition

2016-12-01 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713594#comment-15713594
 ] 

Guanghao Zhang commented on HBASE-17205:


Thanks [~mbertozzi] for reviewing.

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205-v1.patch, HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall duration of a 
> region in transition. When moving a region from A to B, the region state transitions are 
> PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => OPENING => OPENED. Each time the 
> old region state is transformed into the new region state, the timestamp is updated to 
> the current time, so we can't get the overall duration of the region in transition. Add 
> a RIT duration to RegionState to accumulate this metric.
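A minimal sketch of the accumulation idea follows; the class and method names are illustrative, not the exact ones in the patch:

{code}
// Illustrative sketch: keep an accumulated RIT duration that survives per-state timestamp resets.
public class RitDurationTracker {
  private long stamp = System.currentTimeMillis(); // reset on every state transition
  private long ritDuration = 0;                    // accumulated across all transitions

  /** Called whenever the region moves to a new transition state. */
  public synchronized void onStateTransition() {
    long now = System.currentTimeMillis();
    ritDuration += now - stamp; // fold the time spent in the previous state into the total
    stamp = now;
  }

  /** Overall time the region has spent in transition so far. */
  public synchronized long getRitDuration() {
    return ritDuration + (System.currentTimeMillis() - stamp);
  }
}
{code}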



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17140) Reduce meta request number by skipping table state check

2016-11-29 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707486#comment-15707486
 ] 

Guanghao Zhang commented on HBASE-17140:


Thanks for your reply.

bq. In the case of (2), the HRI for parent region is saved with split=true, 
offline=true (similar for merge).
If I am not wrong, when merging A and B into a new region, the region info of A and B is 
deleted directly? So split=true, offline=true means a split parent region, and 
offline=true alone means a region of a disabled table.

bq. When the table is re-enabled again, we do not want to bring back the old 
parents.
When enabling a table, it needs to get the table regions first, and the split parent 
region will be filtered out in this step. So I thought it can't be brought back?


> Reduce meta request number by skipping table state check
> 
>
> Key: HBASE-17140
> URL: https://issues.apache.org/jira/browse/HBASE-17140
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17140-v1.patch, HBASE-17140-v2.patch, 
> HBASE-17140-v3.patch, HBASE-17140-v4.patch, HBASE-17140-v5.patch
>
>
> Now a request for a disabled table needs 3 rpc calls before it fails.
> 1. get region location
> 2. send call to rs and get NotServeRegionException
> 3. retry and check the table state, then throw TableNotEnabledException
> The table state check was added for disabled tables. But the prepare method in 
> RegionServerCallable shows that every retry request will get the table state first.
> {code}
> public void prepare(final boolean reload) throws IOException {
>   // check table state if this is a retry
>   if (reload && !tableName.equals(TableName.META_TABLE_NAME) &&
>       getConnection().isTableDisabled(tableName)) {
>     throw new TableNotEnabledException(tableName.getNameAsString() + " is disabled.");
>   }
>   try (RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
>     this.location = regionLocator.getRegionLocation(row);
>   }
>   if (this.location == null) {
>     throw new IOException("Failed to find location, tableName=" + tableName +
>         ", row=" + Bytes.toString(row) + ", reload=" + reload);
>   }
>   setStubByServiceName(this.location.getServerName());
> }
> {code}
> An improvement is to set the region offline in HRegionInfo and throw a 
> RegionOfflineException when getting the region location. Then we don't need to check 
> the table state for any retry request.
> Review board: https://reviews.apache.org/r/54071/
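A minimal sketch of the proposed fail-fast check at location-lookup time (simplified; not the exact patch code):

{code}
// Illustrative sketch: fail fast on an offline region instead of re-checking table state on retry.
import java.io.IOException;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.RegionOfflineException;

public class OfflineRegionCheck {
  static HRegionLocation locateOrFail(RegionLocator regionLocator, byte[] row) throws IOException {
    HRegionLocation location = regionLocator.getRegionLocation(row);
    if (location != null && location.getRegionInfo().isOffline()) {
      throw new RegionOfflineException(
          "Region " + location.getRegionInfo().getRegionNameAsString() + " is offline");
    }
    return location;
  }
}
{code}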



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Attachment: HBASE-17178-v5.patch

Attached a v5 patch that addresses the review comments.

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch, HBASE-17178-v5.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/
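The arithmetic described above can be sketched directly; the names here are illustrative and the patch's actual throttling code differs:

{code}
// Illustrative sketch of the throttling arithmetic from the description above.
public class BalanceThrottleMath {
  /** Max regions allowed in transition so the moves finish within maxBalancingTimeMs. */
  static int maxRegionsInTransition(int regionsToMove, long maxBalancingTimeMs, long avgRitTimeMs) {
    double sequentialMovesPerWindow = (double) maxBalancingTimeMs / avgRitTimeMs; // e.g. 600s / 1s = 600
    return Math.max(1, (int) Math.ceil(regionsToMove / sequentialMovesPerWindow));
  }

  public static void main(String[] args) {
    long maxBalancingTimeMs = 10 * 60 * 1000L;
    long avgRitTimeMs = 1000L;
    // Case 1: one regionserver crashes, at most 200 regions to move -> 1 region at a time.
    System.out.println(maxRegionsInTransition(200, maxBalancingTimeMs, avgRitTimeMs));   // 1
    // Case 2: 100 regionservers added, at most 10000 regions to move -> 17 regions at a time.
    System.out.println(maxRegionsInTransition(10000, maxBalancingTimeMs, avgRitTimeMs)); // 17
  }
}
{code}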



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Attachment: HBASE-17178-branch-1.patch

Attach patch for branch-1.

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, 
> HBASE-17178-v2.patch, HBASE-17178-v3.patch, HBASE-17178-v4.patch, 
> HBASE-17178-v5.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Affects Version/s: 1.4.0
   2.0.0

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, 
> HBASE-17178-v2.patch, HBASE-17178-v3.patch, HBASE-17178-v4.patch, 
> HBASE-17178-v5.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17140) Reduce meta request number by skipping table state check

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17140:
---
Summary: Reduce meta request number by skipping table state check  (was: 
Throw RegionOfflineException directly when request for a disabled table)

> Reduce meta request number by skipping table state check
> 
>
> Key: HBASE-17140
> URL: https://issues.apache.org/jira/browse/HBASE-17140
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17140-v1.patch, HBASE-17140-v2.patch, 
> HBASE-17140-v3.patch, HBASE-17140-v4.patch, HBASE-17140-v5.patch
>
>
> Now when request for a disabled table, it need 3 rpc calls before fail.
> 1. get region location
> 2. send call to rs and get NotServeRegionException
> 3. retry and check the table state, then throw TableNotEnabledException
> The table state check is added for disabled table. But now the prepare method 
> in RegionServerCallable shows that all retry request will get table state 
> first.
> {code}
> public void prepare(final boolean reload) throws IOException {
> // check table state if this is a retry
> if (reload && !tableName.equals(TableName.META_TABLE_NAME) &&
> getConnection().isTableDisabled(tableName)) {
>   throw new TableNotEnabledException(tableName.getNameAsString() + " is 
> disabled.");
> }
> try (RegionLocator regionLocator = 
> connection.getRegionLocator(tableName)) {
>   this.location = regionLocator.getRegionLocation(row);
> }
> if (this.location == null) {
>   throw new IOException("Failed to find location, tableName=" + tableName 
> +
>   ", row=" + Bytes.toString(row) + ", reload=" + reload);
> }
> setStubByServiceName(this.location.getServerName());
> }
> {code}
> An improvement is set the region offline in HRegionInfo. Then throw the 
> RegionOfflineException when get region location.
> Review board: https://reviews.apache.org/r/54071/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Description: 
Our online cluster serves dozens of  tables and different tables serve for 
different services. If the balancer moves too many regions in the same time, 
it will decrease the availability for some table or some services. So we add 
region balance throttling on our online serve cluster. 
We introduce a new config hbase.balancer.max.balancing.regions, which means the 
max number of regions in transition when balancing.
If we config this to 1 and a table have 100 regions, then the table will have 
99 regions available at any time. It helps a lot for our use case and it has 
been running a long time
our production cluster.

But for some use case, we need the balancer run faster. If a cluster has 100 
regionservers, then it add 50 new regionservers for peak requests. Then it need 
balancer run as soon as
possible and let the cluster reach a balance state soon. Our idea is compute 
max number of regions in transition by the max balancing time and the average 
time of region in transition.
Then the balancer use the computed value to throttling.

Examples for understanding.
A cluster has 100 regionservers, each regionserver has 200 regions and the 
average time of region in transition is 1 seconds, we config the max balancing 
time is 10 * 60 seconds.
Case 1. One regionserver crash, the cluster at most need balance 200 regions. 
Then 200 / (10 * 60s / 1s) < 1, it means the max number of regions in 
transition is 1 when balancing. Then the balancer can move region one by one 
and the cluster will have high availability  when balancing.
Case 2. Add other 100 regionservers, the cluster at most need balance 1 
regions. Then 1 / (10 * 60s / 1s) = 16.7, it means the max number of 
regions in transition is 17 when balancing. Then the cluster can reach a 
balance state within the max balancing time.

Any suggestions are welcomed.

Review board: https://reviews.apache.org/r/54191/

  was:
Our online cluster serves dozens of tables and different tables serve different services. 
If the balancer moves too many regions at the same time, it will decrease the 
availability of some tables or services. So we added region balance throttling to our 
online serving cluster.
We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
number of regions in transition when balancing.
If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
available at any time. It helps a lot for our use case and it has been running for a long 
time on our production cluster.

But for some use cases we need the balancer to run faster. If a cluster has 100 
regionservers and then adds 50 new regionservers for peak requests, it needs the balancer 
to run as soon as possible and let the cluster reach a balanced state soon. Our idea is 
to compute the max number of regions in transition from the max balancing time and the 
average time of a region in transition. Then the balancer uses the computed value for 
throttling.

Examples for understanding.
A cluster has 100 regionservers, each regionserver has 200 regions, the average time of a 
region in transition is 1 second, and we config the max balancing time to 10 * 60 seconds.
Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 1 
when balancing. Then the balancer can move regions one by one and the cluster will have 
high availability when balancing.
Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
transition is 17 when balancing. Then the cluster can reach a balanced state within the 
max balancing time.

Any suggestions are welcomed.


> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 
> 99 regions available at any 

[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Attachment: HBASE-17178-v4.patch

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707188#comment-15707188
 ] 

Guanghao Zhang commented on HBASE-17178:


Review board: https://reviews.apache.org/r/54191/

bq. Move this line out of synchronized
Fixed in v4 patch.

bq. Shall the balancing be affected by other RIT? Assuming RS crash happened in middle of balancing, shall we wait?
Yes, balancing will be affected by other RIT. This is for availability. If an RS crash 
happens in the middle of balancing, there will be more regions in transition, so the 
balancer can't finish all the region plans. The cluster then needs the next round of 
balancing to reach a balanced state.

bq. the code flow of balancer might block here and not controlled by the cutoffTime?
Fixed in v4 patch. It needs to break out of the sleep when it exceeds the cutoff time.

Review board: https://reviews.apache.org/r/54191/



> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707188#comment-15707188
 ] 

Guanghao Zhang edited comment on HBASE-17178 at 11/30/16 1:38 AM:
--

Review board: https://reviews.apache.org/r/54191/

bq. Move this line out of synchronized
Fixed in v4 patch.

bq. Shall the balancing be affected by other RIT? Assuming RS crash happened in middle of balancing, shall we wait?
Yes, balancing will be affected by other RIT. This is for availability. If an RS crash 
happens in the middle of balancing, there will be more regions in transition, so the 
balancer can't finish all the region plans. The cluster then needs the next round of 
balancing to reach a balanced state.

bq. the code flow of balancer might block here and not controlled by the cutoffTime?
Fixed in v4 patch. It needs to break out of the sleep when it exceeds the cutoff time.





was (Author: zghaobac):
Review board: https://reviews.apache.org/r/54191/

bq. Move this line out of synchronized
Fixed in v4 patch.

bq. Shall the balancing be affected by other RIT? Assuming RS crash happened in middle of balancing, shall we wait?
Yes, balancing will be affected by other RIT. This is for availability. If an RS crash 
happens in the middle of balancing, there will be more regions in transition, so the 
balancer can't finish all the region plans. The cluster then needs the next round of 
balancing to reach a balanced state.

bq. the code flow of balancer might block here and not controlled by the cutoffTime?
Fixed in v4 patch. It needs to break out of the sleep when it exceeds the cutoff time.

Review board: https://reviews.apache.org/r/54191/



> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17140) Reduce meta request number by skipping table state check

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17140:
---
Affects Version/s: 2.0.0
  Description: 
Now a request for a disabled table needs 3 rpc calls before it fails.
1. get region location
2. send call to rs and get NotServeRegionException
3. retry and check the table state, then throw TableNotEnabledException

The table state check was added for disabled tables. But the prepare method in 
RegionServerCallable shows that every retry request will get the table state first.
{code}
public void prepare(final boolean reload) throws IOException {
  // check table state if this is a retry
  if (reload && !tableName.equals(TableName.META_TABLE_NAME) &&
      getConnection().isTableDisabled(tableName)) {
    throw new TableNotEnabledException(tableName.getNameAsString() + " is disabled.");
  }
  try (RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
    this.location = regionLocator.getRegionLocation(row);
  }
  if (this.location == null) {
    throw new IOException("Failed to find location, tableName=" + tableName +
        ", row=" + Bytes.toString(row) + ", reload=" + reload);
  }
  setStubByServiceName(this.location.getServerName());
}
{code}

An improvement is to set the region offline in HRegionInfo and throw a 
RegionOfflineException when getting the region location. Then we don't need to check the 
table state for any retry request.

Review board: https://reviews.apache.org/r/54071/

  was:
Now a request for a disabled table needs 3 rpc calls before it fails.
1. get region location
2. send call to rs and get NotServeRegionException
3. retry and check the table state, then throw TableNotEnabledException

The table state check was added for disabled tables. But the prepare method in 
RegionServerCallable shows that every retry request will get the table state first.
{code}
public void prepare(final boolean reload) throws IOException {
  // check table state if this is a retry
  if (reload && !tableName.equals(TableName.META_TABLE_NAME) &&
      getConnection().isTableDisabled(tableName)) {
    throw new TableNotEnabledException(tableName.getNameAsString() + " is disabled.");
  }
  try (RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
    this.location = regionLocator.getRegionLocation(row);
  }
  if (this.location == null) {
    throw new IOException("Failed to find location, tableName=" + tableName +
        ", row=" + Bytes.toString(row) + ", reload=" + reload);
  }
  setStubByServiceName(this.location.getServerName());
}
{code}

An improvement is to set the region offline in HRegionInfo, then throw a 
RegionOfflineException when getting the region location.

Review board: https://reviews.apache.org/r/54071/


> Reduce meta request number by skipping table state check
> 
>
> Key: HBASE-17140
> URL: https://issues.apache.org/jira/browse/HBASE-17140
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17140-v1.patch, HBASE-17140-v2.patch, 
> HBASE-17140-v3.patch, HBASE-17140-v4.patch, HBASE-17140-v5.patch
>
>
> Now a request for a disabled table needs 3 rpc calls before it fails.
> 1. get region location
> 2. send call to rs and get NotServeRegionException
> 3. retry and check the table state, then throw TableNotEnabledException
> The table state check was added for disabled tables. But the prepare method in 
> RegionServerCallable shows that every retry request will get the table state first.
> {code}
> public void prepare(final boolean reload) throws IOException {
>   // check table state if this is a retry
>   if (reload && !tableName.equals(TableName.META_TABLE_NAME) &&
>       getConnection().isTableDisabled(tableName)) {
>     throw new TableNotEnabledException(tableName.getNameAsString() + " is disabled.");
>   }
>   try (RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
>     this.location = regionLocator.getRegionLocation(row);
>   }
>   if (this.location == null) {
>     throw new IOException("Failed to find location, tableName=" + tableName +
>         ", row=" + Bytes.toString(row) + ", reload=" + reload);
>   }
>   setStubByServiceName(this.location.getServerName());
> }
> {code}
> An improvement is to set the region offline in HRegionInfo and throw a 
> RegionOfflineException when getting the region location. Then we don't need to check 
> the table state for any retry request.
> Review board: https://reviews.apache.org/r/54071/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Release Note: Add region balance throttling. The master executes each region balance plan 
per balance interval, which equals the max balancing time divided by the size of the 
region balance plan. It also introduces a new config, hbase.master.balancer.maxRitPercent, 
to protect availability. If this is configured to 0.01, then the max percent of regions in 
transition is 1% when balancing, so the cluster's availability is at least 99% when 
balancing.
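A minimal sketch of the two knobs described in this release note (illustrative only; the real logic lives in the master/balancer code, and the config values used here are assumed):

{code}
// Illustrative sketch of the balance interval and maxRitPercent knobs from the release note.
public class BalanceThrottleReleaseNoteExample {
  public static void main(String[] args) {
    long maxBalancingTimeMs = 10 * 60 * 1000L; // max balancing time for one run (assumed value)
    int regionPlanCount = 120;                 // size of the region balance plan for this round

    // The master executes one region plan per interval: max balancing time / number of plans.
    long balanceIntervalMs = maxBalancingTimeMs / regionPlanCount;
    System.out.println("balance interval (ms) = " + balanceIntervalMs); // 5000

    // hbase.master.balancer.maxRitPercent = 0.01 keeps at most 1% of regions in transition,
    // so at least 99% of regions stay available while balancing.
    double maxRitPercent = 0.01;
    int totalRegions = 20000;
    int maxRegionsInTransition = (int) Math.floor(totalRegions * maxRitPercent);
    System.out.println("max regions in transition = " + maxRegionsInTransition); // 200
  }
}
{code}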

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch
>
>
> Our online cluster serves dozens of tables and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will decrease the 
> availability of some tables or services. So we added region balance throttling to our 
> online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the max 
> number of regions in transition when balancing.
> If we config this to 1 and a table has 100 regions, then the table will have 99 regions 
> available at any time. It helps a lot for our use case and it has been running for a 
> long time on our production cluster.
> But for some use cases we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, it needs the 
> balancer to run as soon as possible and let the cluster reach a balanced state soon. 
> Our idea is to compute the max number of regions in transition from the max balancing 
> time and the average time of a region in transition. Then the balancer uses the 
> computed value for throttling.
> Examples for understanding.
> A cluster has 100 regionservers, each regionserver has 200 regions, the average time of 
> a region in transition is 1 second, and we config the max balancing time to 10 * 60 
> seconds.
> Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. 
> Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 
> 1 when balancing. Then the balancer can move regions one by one and the cluster will 
> have high availability when balancing.
> Case 2. Add another 100 regionservers, so the cluster needs to balance at most 10000 
> regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in 
> transition is 17 when balancing. Then the cluster can reach a balanced state within the 
> max balancing time.
> Any suggestions are welcomed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Attachment: HBASE-17178-v6.patch

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, 
> HBASE-17178-v2.patch, HBASE-17178-v3.patch, HBASE-17178-v4.patch, 
> HBASE-17178-v5.patch, HBASE-17178-v6.patch
>
>
> Our online cluster serves dozens of tables, and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will 
> decrease the availability of some tables or services. So we added region balance 
> throttling on our online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the 
> max number of regions in transition when balancing.
> If we set this to 1 and a table has 100 regions, then the table will have at 
> least 99 regions available at any time. It helps a lot for our use case and it 
> has been running for a long time on our production cluster.
> But for some use cases, we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, the balancer 
> needs to run as soon as possible so that the cluster reaches a balanced state 
> quickly. Our idea is to compute the max number of regions in transition from the 
> max balancing time and the average time a region spends in transition, and let 
> the balancer use the computed value for throttling.
> Examples for understanding:
> A cluster has 100 regionservers, each regionserver has 200 regions, the average 
> time of a region in transition is 1 second, and we configure the max balancing 
> time to 10 * 60 seconds.
> Case 1. One regionserver crashes; the cluster needs to balance at most 200 
> regions. Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions 
> in transition is 1 when balancing. The balancer can move regions one by one and 
> the cluster keeps high availability while balancing.
> Case 2. Another 100 regionservers are added; the cluster needs to balance at 
> most 10000 regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max 
> number of regions in transition is 17 when balancing. The cluster can then reach 
> a balanced state within the max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17178) Add region balance throttling

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17178:
---
Attachment: HBASE-17178-branch-1-v1.patch

Update patch for branch-1.

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17178-branch-1-v1.patch, 
> HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch, HBASE-17178-v5.patch, 
> HBASE-17178-v6.patch
>
>
> Our online cluster serves dozens of tables, and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will 
> decrease the availability of some tables or services. So we added region balance 
> throttling on our online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the 
> max number of regions in transition when balancing.
> If we set this to 1 and a table has 100 regions, then the table will have at 
> least 99 regions available at any time. It helps a lot for our use case and it 
> has been running for a long time on our production cluster.
> But for some use cases, we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, the balancer 
> needs to run as soon as possible so that the cluster reaches a balanced state 
> quickly. Our idea is to compute the max number of regions in transition from the 
> max balancing time and the average time a region spends in transition, and let 
> the balancer use the computed value for throttling.
> Examples for understanding:
> A cluster has 100 regionservers, each regionserver has 200 regions, the average 
> time of a region in transition is 1 second, and we configure the max balancing 
> time to 10 * 60 seconds.
> Case 1. One regionserver crashes; the cluster needs to balance at most 200 
> regions. Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions 
> in transition is 1 when balancing. The balancer can move regions one by one and 
> the cluster keeps high availability while balancing.
> Case 2. Another 100 regionservers are added; the cluster needs to balance at 
> most 10000 regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max 
> number of regions in transition is 17 when balancing. The cluster can then reach 
> a balanced state within the max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Status: Patch Available  (was: Open)

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Attachment: HBASE-17205.patch

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17178) Add region balance throttling

2016-11-30 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708146#comment-15708146
 ] 

Guanghao Zhang commented on HBASE-17178:


Thanks [~yangzhe1991] [~tedyu] [~carp84] [~ashish singhi] for reviewing.

> Add region balance throttling
> -
>
> Key: HBASE-17178
> URL: https://issues.apache.org/jira/browse/HBASE-17178
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17178-branch-1-v1.patch, 
> HBASE-17178-branch-1.patch, HBASE-17178-v1.patch, HBASE-17178-v2.patch, 
> HBASE-17178-v3.patch, HBASE-17178-v4.patch, HBASE-17178-v5.patch, 
> HBASE-17178-v6.patch
>
>
> Our online cluster serves dozens of tables, and different tables serve different 
> services. If the balancer moves too many regions at the same time, it will 
> decrease the availability of some tables or services. So we added region balance 
> throttling on our online serving cluster.
> We introduce a new config hbase.balancer.max.balancing.regions, which means the 
> max number of regions in transition when balancing.
> If we set this to 1 and a table has 100 regions, then the table will have at 
> least 99 regions available at any time. It helps a lot for our use case and it 
> has been running for a long time on our production cluster.
> But for some use cases, we need the balancer to run faster. If a cluster has 100 
> regionservers and then adds 50 new regionservers for peak requests, the balancer 
> needs to run as soon as possible so that the cluster reaches a balanced state 
> quickly. Our idea is to compute the max number of regions in transition from the 
> max balancing time and the average time a region spends in transition, and let 
> the balancer use the computed value for throttling.
> Examples for understanding:
> A cluster has 100 regionservers, each regionserver has 200 regions, the average 
> time of a region in transition is 1 second, and we configure the max balancing 
> time to 10 * 60 seconds.
> Case 1. One regionserver crashes; the cluster needs to balance at most 200 
> regions. Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions 
> in transition is 1 when balancing. The balancer can move regions one by one and 
> the cluster keeps high availability while balancing.
> Case 2. Another 100 regionservers are added; the cluster needs to balance at 
> most 10000 regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max 
> number of regions in transition is 17 when balancing. The cluster can then reach 
> a balanced state within the max balancing time.
> Any suggestions are welcomed.
> Review board: https://reviews.apache.org/r/54191/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17205:
--

 Summary: Add a metric for the duration of region in transition
 Key: HBASE-17205
 URL: https://issues.apache.org/jira/browse/HBASE-17205
 Project: HBase
  Issue Type: Improvement
  Components: Region Assignment
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang
Priority: Minor


While working on HBASE-17178, I found there is no metric for the overall duration 
of a region in transition. When moving a region from A to B, the region state 
transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => OPENING => 
OPENED. Each time the old region state transforms to the new region state, the 
timestamp is updated to the current time, so we can't get the overall duration of 
the region in transition. Add a RIT duration to RegionState to accumulate this 
metric.
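
A minimal sketch of the accumulation idea, with field and method names chosen for 
illustration rather than copied from the patch:

{code}
// Illustrative sketch: keep a running total of time spent in transition, added to
// on every state change, while the per-state timestamp still resets as before.
public class RitDurationSketch {
  private long stamp = System.currentTimeMillis(); // reset on every state change
  private long ritDuration = 0;                    // overall time in transition

  // Hypothetical hook invoked on each state change, e.g. CLOSING => CLOSED.
  public void onStateChange() {
    long now = System.currentTimeMillis();
    ritDuration += now - stamp; // accumulate the time spent in the previous state
    stamp = now;
  }

  public long getRitDuration() {
    return ritDuration;
  }
}
{code}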



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16336) Removing peers seem to be leaving spare queues

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724132#comment-15724132
 ] 

Guanghao Zhang commented on HBASE-16336:


HBASE-12769 tries to fix this via hbck. A more automatic way is to add a 
replication zk node checker on the master, which would periodically check for and 
delete useless replication znodes. In our use case, we found dead RS znodes were 
left behind, and a dead RS znode can only be transferred when another RS restarts, 
so the replication zk node checker should check dead RS znodes too. I know the 
more proper solutions are HBASE-11392 and HBASE-12439, but for branch-1 we can 
resolve this with a replication zk node checker. Any ideas? [~enis] 
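
A rough sketch of such a checker as a master chore, assuming hypothetical helpers 
(listReplicationRsZNodes, isUseless, deleteZNode) that are not real HBase APIs:

{code}
// Rough sketch only: a master-side chore that periodically scans replication
// znodes and deletes the ones that no longer belong to a live peer or region server.
import java.util.List;

import org.apache.hadoop.hbase.ScheduledChore;
import org.apache.hadoop.hbase.Stoppable;

public abstract class ReplicationZNodeCheckerChore extends ScheduledChore {

  public ReplicationZNodeCheckerChore(Stoppable stopper, int period) {
    super("ReplicationZNodeChecker", stopper, period);
  }

  @Override
  protected void chore() {
    for (String rsZNode : listReplicationRsZNodes()) {
      // Dead region servers leave their replication znodes behind until another
      // RS restarts and claims the queues; clean them up here instead.
      if (isUseless(rsZNode)) {
        deleteZNode(rsZNode);
      }
    }
  }

  protected abstract List<String> listReplicationRsZNodes(); // hypothetical helper

  protected abstract boolean isUseless(String rsZNode);      // hypothetical helper

  protected abstract void deleteZNode(String rsZNode);       // hypothetical helper
}
{code}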

> Removing peers seem to be leaving spare queues
> --
>
> Key: HBASE-16336
> URL: https://issues.apache.org/jira/browse/HBASE-16336
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Joseph
>
> I have been running IntegrationTestReplication repeatedly with the backported 
> Replication Table changes. Every other iteration of the test fails with the error 
> below, but these queues should have been deleted when we removed the peers. I 
> believe this may be related to HBASE-16096, HBASE-16208, or HBASE-16081.
> 16/08/02 08:36:07 ERROR util.AbstractHBaseTool: Error running command-line 
> tool
> org.apache.hadoop.hbase.replication.ReplicationException: undeleted queue for 
> peerId: TestPeer, replicator: 
> hbase4124.ash2.facebook.com,16020,1470150251042, queueId: TestPeer
>   at 
> org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.checkQueuesDeleted(ReplicationPeersZKImpl.java:544)
>   at 
> org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.addPeer(ReplicationPeersZKImpl.java:127)
>   at 
> org.apache.hadoop.hbase.client.replication.ReplicationAdmin.addPeer(ReplicationAdmin.java:200)
>   at 
> org.apache.hadoop.hbase.test.IntegrationTestReplication$VerifyReplicationLoop.setupTablesAndReplication(IntegrationTestReplication.java:239)
>   at 
> org.apache.hadoop.hbase.test.IntegrationTestReplication$VerifyReplicationLoop.run(IntegrationTestReplication.java:325)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at 
> org.apache.hadoop.hbase.test.IntegrationTestReplication.runTestFromCommandLine(IntegrationTestReplication.java:418)
>   at 
> org.apache.hadoop.hbase.IntegrationTestBase.doWork(IntegrationTestBase.java:134)
>   at 
> org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at 
> org.apache.hadoop.hbase.test.IntegrationTestReplication.main(IntegrationTestReplication.java:424)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724002#comment-15724002
 ] 

Guanghao Zhang commented on HBASE-17261:


After HBASE-15529, the cluster is only balanced when (total cost / sum multiplier) > 
minCostNeedBalance. In the log above, 525.2547686174673 / 111087.0 is about 0.0047, 
which is less than the default minCostNeedBalance of 0.05, so balancing is skipped.

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724030#comment-15724030
 ] 

Guanghao Zhang commented on HBASE-17261:


We can update the default value of 
hbase.master.balancer.stochastic.minCostNeedBalance to 0.0 and keep the default 
behavior the same as before HBASE-15529. Any ideas? [~stack]
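
As a hedged illustration, a cluster that wants the old behaviour back today could 
zero the threshold in its configuration (key as named above):

{code}
// Example only: set the threshold to 0 so the stochastic balancer never skips
// balancing because the weighted average cost looks "low enough".
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MinCostNeedBalanceExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.setFloat("hbase.master.balancer.stochastic.minCostNeedBalance", 0.0f);
    System.out.println(conf.get("hbase.master.balancer.stochastic.minCostNeedBalance"));
  }
}
{code}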

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724403#comment-15724403
 ] 

Guanghao Zhang commented on HBASE-17261:


bq. sum multiplier is 111087.0
Did the cluster use all default configs in StochasticLoadBalancer?
bq. What you think is up?
We have been using this in our cluster, but I thought the default value should be 
zero. This config should only be used by power users. I will upload a patch for 
this.

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17261:
---
Attachment: HBASE-17261.patch

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Attachments: HBASE-17261.patch
>
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reassigned HBASE-17261:
--

Assignee: Guanghao Zhang

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Guanghao Zhang
> Attachments: HBASE-17261.patch
>
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17261:
---
Status: Patch Available  (was: Open)

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Guanghao Zhang
> Attachments: HBASE-17261.patch
>
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724490#comment-15724490
 ] 

Guanghao Zhang commented on HBASE-17261:


bq. currently on 1.2
branch-1.2? But HBASE-15529 was only merged to branch-1 and master, so branch-1.2 
should not have this problem.

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Guanghao Zhang
> Attachments: HBASE-17261.patch
>
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17261) Balancer makes no sense on tip of branch-1: says balanced when not

2016-12-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724470#comment-15724470
 ] 

Guanghao Zhang commented on HBASE-17261:


{code}
private static final String REGION_REPLICA_HOST_COST_KEY =
    "hbase.master.balancer.stochastic.regionReplicaHostCostKey";
private static final float DEFAULT_REGION_REPLICA_HOST_COST_KEY = 100000;
{code}
The default region replica host cost multiplier is too big and has the most weight 
in the total cost. So when the replica cost is small, the cluster can't be 
balanced. I will upload a patch for this.
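
A small worked example with made-up numbers of why the skip condition keeps firing 
when the replica multiplier dwarfs everything else:

{code}
// Made-up numbers: a real imbalance on region count is drowned out by the huge
// replica multiplier paired with a near-zero replica cost.
public class SkipConditionExample {
  public static void main(String[] args) {
    double regionCountCost   = 0.5 * 500;     // noticeable region count imbalance
    double regionReplicaCost = 0.0 * 100000;  // no region replicas in use
    double totalCost = regionCountCost + regionReplicaCost;
    double sumMultiplier = 500 + 100000;
    // ~0.0025, well below the 0.05 threshold, so balancing is skipped
    System.out.println(totalCost / sumMultiplier);
  }
}
{code}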

> Balancer makes no sense on tip of branch-1: says balanced when not
> --
>
> Key: HBASE-17261
> URL: https://issues.apache.org/jira/browse/HBASE-17261
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Guanghao Zhang
> Attachments: HBASE-17261.patch
>
>
> Running ITBLL on tip of branch-1, I see this in log when I try to balance:
> {code}
> 2016-12-05 16:42:21,031 INFO  
> [RpcServer.deafult.FPBQ.Fifo.handler=46,queue=1,port=16000] 
> balancer.StochasticLoadBalancer: Skipping load balancing because balanced 
> cluster; total cost is 525.2547686174673|
> , sum multiplier is 111087.0 min cost which need balance is 0.05
> {code}
> It's some old nonsense. 
> It does this every time I balance. Can't even force a balance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Affects Version/s: 1.4.0
   2.0.0

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Attachment: HBASE-17205-branch-1.patch

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Attachment: HBASE-17205-v1.patch

Attach v1 patch addressing review comments.

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-v1.patch, HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710497#comment-15710497
 ] 

Guanghao Zhang commented on HBASE-17205:


bq.  with the new AM we have the actual time of assign and unassign operation 
for each region and the time of the region in failed open or those kind of 
states.
Looking forward to the new AM in 2.0. :)


> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-v1.patch, HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710926#comment-15710926
 ] 

Guanghao Zhang commented on HBASE-17205:


The failed unit tests are related to HBASE-17212.

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2016-11-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17205:
---
Attachment: HBASE-17205-v1.patch

There was no precommit job run for v1. Attaching again.

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205-v1.patch, HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the region 
> state transitions are PENDING_CLOSE => CLOSING => CLOSED => PENDING_OPEN => 
> OPENING => OPENED. Each time the old region state transforms to the new region 
> state, the timestamp is updated to the current time, so we can't get the overall 
> duration of the region in transition. Add a RIT duration to RegionState to 
> accumulate this metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17337) list replication peers request should be routed through master

2017-01-06 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17337:
---
Attachment: HBASE-17337-v1.patch

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17337) list replication peers request should be routed through master

2017-01-06 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15804296#comment-15804296
 ] 

Guanghao Zhang commented on HBASE-17337:


Attach a v1 patch. Waiting for the Hadoop QA result.

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17337) list replication peers request should be routed through master

2017-01-06 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17337:
---
Status: Patch Available  (was: Open)

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Status: Patch Available  (was: Open)

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-05 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15803258#comment-15803258
 ] 

Guanghao Zhang commented on HBASE-17388:


Pushed to master. Thanks [~enis] for review.

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-05 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2016-12-28 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17388:
--

 Summary: Move ReplicationPeer and other replication related PB 
messages to the replication.proto
 Key: HBASE-17388
 URL: https://issues.apache.org/jira/browse/HBASE-17388
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Affects Versions: 2.0.0
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang
 Fix For: 2.0.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-17389) Convert all internal usages from ReplicationAdmin to Admin

2016-12-28 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reassigned HBASE-17389:
--

Assignee: Guanghao Zhang

> Convert all internal usages from ReplicationAdmin to Admin
> --
>
> Key: HBASE-17389
> URL: https://issues.apache.org/jira/browse/HBASE-17389
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17336) get/update replication peer config requests should be routed through master

2016-12-28 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784294#comment-15784294
 ] 

Guanghao Zhang commented on HBASE-17336:


Attach a v5 patch to fix the copy-paste error.

bq. move ReplicationPeer and other replication related PB messages to the 
replication.proto from zookeeper.proto.
bq. Maybe after all methods moved to Admin, we can do a refactor patch to 
convert internal usages from RA to Admin.
Opened new issues HBASE-17388 and HBASE-17389 for these.

> get/update replication peer config requests should be routed through master
> ---
>
> Key: HBASE-17336
> URL: https://issues.apache.org/jira/browse/HBASE-17336
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17336-v1.patch, HBASE-17336-v2.patch, 
> HBASE-17336-v3.patch, HBASE-17336-v4.patch, HBASE-17336-v5.patch
>
>
> As the HBASE-11392 description says, we should move replication operations to be 
> routed through the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17336) get/update replication peer config requests should be routed through master

2016-12-28 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784299#comment-15784299
 ] 

Guanghao Zhang commented on HBASE-17336:


Thanks for your suggestion. I will do it in HBASE-17389.

> get/update replication peer config requests should be routed through master
> ---
>
> Key: HBASE-17336
> URL: https://issues.apache.org/jira/browse/HBASE-17336
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17336-v1.patch, HBASE-17336-v2.patch, 
> HBASE-17336-v3.patch, HBASE-17336-v4.patch, HBASE-17336-v5.patch
>
>
> As the HBASE-11392 description says, we should move replication operations to be 
> routed through the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17389) Convert all internal usages from ReplicationAdmin to Admin

2016-12-28 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17389:
--

 Summary: Convert all internal usages from ReplicationAdmin to Admin
 Key: HBASE-17389
 URL: https://issues.apache.org/jira/browse/HBASE-17389
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Guanghao Zhang
 Fix For: 2.0.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17336) get/update replication peer config requests should be routed through master

2016-12-28 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17336:
---
Attachment: HBASE-17336-v5.patch

> get/update replication peer config requests should be routed through master
> ---
>
> Key: HBASE-17336
> URL: https://issues.apache.org/jira/browse/HBASE-17336
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17336-v1.patch, HBASE-17336-v2.patch, 
> HBASE-17336-v3.patch, HBASE-17336-v4.patch, HBASE-17336-v5.patch
>
>
> As the HBASE-11392 description says, we should move replication operations to be 
> routed through the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Attachment: HBASE-17388.patch

Try to trigger Hadoop QA again.

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Attachment: HBASE-17388.patch

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Status: Open  (was: Patch Available)

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Status: Patch Available  (was: Open)

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch, HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2017-01-09 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Attachment: HBASE-17396-v4.patch

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch, HBASE-17396-v2.patch, 
> HBASE-17396-v3.patch, HBASE-17396-v4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17336) get/update replication peer config requests should be routed through master

2016-12-28 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17336:
---
Release Note: Get/update replication peer config requests will be routed 
through master.

> get/update replication peer config requests should be routed through master
> ---
>
> Key: HBASE-17336
> URL: https://issues.apache.org/jira/browse/HBASE-17336
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17336-v1.patch, HBASE-17336-v2.patch, 
> HBASE-17336-v3.patch, HBASE-17336-v4.patch, HBASE-17336-v5.patch
>
>
> As the HBASE-11392 description says, we should move replication operations to be 
> routed through the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17336) get/update replication peer config requests should be routed through master

2016-12-29 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17336:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks all for reviewing.

> get/update replication peer config requests should be routed through master
> ---
>
> Key: HBASE-17336
> URL: https://issues.apache.org/jira/browse/HBASE-17336
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17336-v1.patch, HBASE-17336-v2.patch, 
> HBASE-17336-v3.patch, HBASE-17336-v4.patch, HBASE-17336-v5.patch
>
>
> As the HBASE-11392 description says, we should move replication operations to be 
> routed through the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17396) Add first async admin impl and implement balance methods

2016-12-30 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17396:
--

 Summary: Add first async admin impl and implement balance methods
 Key: HBASE-17396
 URL: https://issues.apache.org/jira/browse/HBASE-17396
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2016-12-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Affects Version/s: 2.0.0

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2016-12-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Status: Patch Available  (was: Open)

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2016-12-30 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Attachment: HBASE-17396-v1.patch

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17396) Add first async admin impl and implement balance methods

2016-12-30 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787455#comment-15787455
 ] 

Guanghao Zhang commented on HBASE-17396:


Attach an initial patch that only implements the balance methods. I used a 
MasterService stub directly in RpcRetryingCaller and didn't use 
MasterKeepAliveConnection.
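
A very rough sketch of the shape of such an async call; MasterRpc and callMaster 
are hypothetical names, not the classes used in the patch:

{code}
// Hypothetical sketch only: wrap a master-side balance RPC into a CompletableFuture,
// talking to a MasterService stub directly instead of a keep-alive connection.
import java.util.concurrent.CompletableFuture;

public class AsyncBalanceSketch {

  interface MasterRpc<T> {        // stand-in for one call on the MasterService stub
    T call() throws Exception;
  }

  static <T> CompletableFuture<T> callMaster(MasterRpc<T> rpc) {
    CompletableFuture<T> future = new CompletableFuture<>();
    // A real implementation would hand this off to the async RPC layer with retries.
    try {
      future.complete(rpc.call());
    } catch (Exception e) {
      future.completeExceptionally(e);
    }
    return future;
  }

  public static void main(String[] args) {
    CompletableFuture<Boolean> balanced = callMaster(() -> Boolean.TRUE /* run balancer */);
    balanced.thenAccept(ran -> System.out.println("balancer ran: " + ran));
  }
}
{code}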

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Attachment: HBASE-17396-v2.patch

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch, HBASE-17396-v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17388:
---
Attachment: HBASE-17388.patch

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17388) Move ReplicationPeer and other replication related PB messages to the replication.proto

2017-01-04 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15797836#comment-15797836
 ] 

Guanghao Zhang commented on HBASE-17388:


Move TableCF, ReplicationPeer, ReplicationState, ReplicationHLogPosition to 
Replication.proto.

> Move ReplicationPeer and other replication related PB messages to the 
> replication.proto
> ---
>
> Key: HBASE-17388
> URL: https://issues.apache.org/jira/browse/HBASE-17388
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17388.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17396) Add first async admin impl and implement balance methods

2017-01-04 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17396:
---
Attachment: HBASE-17396-v3.patch

> Add first async admin impl and implement balance methods
> 
>
> Key: HBASE-17396
> URL: https://issues.apache.org/jira/browse/HBASE-17396
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17396-v1.patch, HBASE-17396-v2.patch, 
> HBASE-17396-v3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17337) list replication peers request should be routed through master

2017-01-08 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17337:
---
Attachment: HBASE-17337-v2.patch

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch, HBASE-17337-v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17337) list replication peers request should be routed through master

2017-01-08 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15810601#comment-15810601
 ] 

Guanghao Zhang commented on HBASE-17337:


bq. Add javadoc and InterfaceAudience to ReplicationPeerDescription class.
Added in v2.
bq. If pattern is not null and once the pattern matches the peer id then we can 
break out of the for loop.
This is a list operation and there may be many peers that match the pattern.
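
To make the point concrete, here is a minimal sketch of such a list-style filter (illustrative only; it assumes a ReplicationPeerDescription#getPeerId accessor and is not necessarily the HBASE-17337 implementation):
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import org.apache.hadoop.hbase.replication.ReplicationPeerDescription;

// Collect every peer whose id matches the pattern; since this is a list
// operation, the loop must not break after the first match.
static List<ReplicationPeerDescription> filterPeers(
    List<ReplicationPeerDescription> peers, Pattern pattern) {
  List<ReplicationPeerDescription> matched = new ArrayList<>();
  for (ReplicationPeerDescription peer : peers) {
    if (pattern == null || pattern.matcher(peer.getPeerId()).matches()) {
      matched.add(peer);
    }
  }
  return matched;
}
{code}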

> list replication peers request should be routed through master
> --
>
> Key: HBASE-17337
> URL: https://issues.apache.org/jira/browse/HBASE-17337
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17337-v1.patch, HBASE-17337-v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Assignee: Guanghao Zhang
  Status: Patch Available  (was: Open)

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch
>
>
> We have moved other replication requests to Admin and mark ReplicationAdmin 
> as Deprecated, so listReplicated/enableTableRep/disableTableRep methods need 
> move to Admin, too.
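
As a rough sketch, the relocated methods might surface on Admin along these lines (signatures are assumptions modelled on the deprecated ReplicationAdmin methods, not the committed API; the TableCFs type is assumed here):
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.replication.TableCFs;

// Hedged sketch only, not the final HBASE-17443 interface.
public interface Admin extends Closeable {
  // ... existing admin methods ...

  /** List the tables and column families replicated to peer clusters. */
  List<TableCFs> listReplicatedTableCFs() throws IOException;

  /** Enable replication for a table (replication scope on its families). */
  void enableTableReplication(TableName tableName) throws IOException;

  /** Disable replication for a table. */
  void disableTableReplication(TableName tableName) throws IOException;
}
{code}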



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Attachment: HBASE-17443-v1.patch

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch
>
>
> We have moved other replication requests to Admin and mark ReplicationAdmin 
> as Deprecated, so listReplicated/enableTableRep/disableTableRep methods need 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep methods from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17443:
---
Summary: Move listReplicated/enableTableRep/disableTableRep methods from 
ReplicationAdmin to Admin  (was: Move 
listReplicated/enableTableRep/disableTableRep from ReplicationAdmin to Admin)

> Move listReplicated/enableTableRep/disableTableRep methods from 
> ReplicationAdmin to Admin
> -
>
> Key: HBASE-17443
> URL: https://issues.apache.org/jira/browse/HBASE-17443
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-17443-v1.patch
>
>
> We have moved other replication requests to Admin and mark ReplicationAdmin 
> as Deprecated, so listReplicated/enableTableRep/disableTableRep methods need 
> move to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17443) Move listReplicated/enableTableRep/disableTableRep from ReplicationAdmin to Admin

2017-01-10 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17443:
--

 Summary: Move listReplicated/enableTableRep/disableTableRep from 
ReplicationAdmin to Admin
 Key: HBASE-17443
 URL: https://issues.apache.org/jira/browse/HBASE-17443
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Guanghao Zhang
 Fix For: 2.0.0


We have moved other replication requests to Admin and mark ReplicationAdmin as 
Deprecated, so listReplicated/enableTableRep/disableTableRep methods need move 
to Admin, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17341) Add a timeout during replication endpoint termination

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766012#comment-15766012
 ] 

Guanghao Zhang commented on HBASE-17341:


+1 on v2. We met this problem on our cluster, too. The region server shutdown 
hung while terminating the ReplicationSource.

> Add a timeout during replication endpoint termination
> -
>
> Key: HBASE-17341
> URL: https://issues.apache.org/jira/browse/HBASE-17341
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: HBASE-17341.branch-1.1.v1.patch, 
> HBASE-17341.branch-1.1.v2.patch, HBASE-17341.master.v1.patch, 
> HBASE-17341.master.v2.patch
>
>
> In ReplicationSource#terminate(), a Future is obtained from 
> ReplicationEndpoint#stop().  Future.get() is then called, but can potentially 
> hang there if something went wrong in the endpoint stop().
> Hanging there has serious implications, because the thread could potentially 
> be the ZK event thread (e.g. watcher calls 
> ReplicationSourceManager#removePeer() -> ReplicationSource#terminate() -> 
> blocked).  This means no other events in the ZK event queue will get 
> processed, which for HBase means other ZK watches such as replication watch 
> notifications, snapshot watch notifications, even RegionServer shutdown will 
> all get blocked.
> The short term fix addressed here is to simply add a timeout for 
> Future.get().  But the severe consequences seen here perhaps suggest a 
> broader refactoring of the ZKWatcher usage in HBase is in order, to protect 
> against situations like this.
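
A minimal sketch of that bounded wait, assuming a plain java.util.concurrent.Future and an arbitrary 30-second timeout (not the actual patch):
{code}
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Wait for the endpoint's stop() future, but never longer than a fixed
// timeout, so the calling thread (possibly the ZK event thread) cannot hang.
static void awaitEndpointStop(Future<?> stopFuture, String sourceName) {
  try {
    stopFuture.get(30, TimeUnit.SECONDS);
  } catch (TimeoutException e) {
    System.err.println("Timed out waiting for replication endpoint of "
        + sourceName + " to stop");
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
  } catch (ExecutionException e) {
    System.err.println("Endpoint stop failed for " + sourceName + ": " + e.getCause());
  }
}
{code}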



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765810#comment-15765810
 ] 

Guanghao Zhang commented on HBASE-11392:


The failed unit test is unrelated. [~enis] [~ashish singhi] Any more comments on the 
v6 patch? Thanks.

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch, 
> HBASE-11392-v6.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-11392:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch, 
> HBASE-11392-v6.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17341) Add a timeout during replication endpoint termination

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765804#comment-15765804
 ] 

Guanghao Zhang commented on HBASE-17341:


+1 on this. One minor comment.
bq. LOG.warn("Got exception:", e);
Can you add more info to this log?

> Add a timeout during replication endpoint termination
> -
>
> Key: HBASE-17341
> URL: https://issues.apache.org/jira/browse/HBASE-17341
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: HBASE-17341.branch-1.1.v1.patch, 
> HBASE-17341.master.v1.patch
>
>
> In ReplicationSource#terminate(), a Future is obtained from 
> ReplicationEndpoint#stop().  Future.get() is then called, but can potentially 
> hang there if something went wrong in the endpoint stop().
> Hanging there has serious implications, because the thread could potentially 
> be the ZK event thread (e.g. watcher calls 
> ReplicationSourceManager#removePeer() -> ReplicationSource#terminate() -> 
> blocked).  This means no other events in the ZK event queue will get 
> processed, which for HBase means other ZK watches such as replication watch 
> notifications, snapshot watch notifications, even RegionServer shutdown will 
> all get blocked.
> The short term fix addressed here is to simply add a timeout for 
> Future.get().  But the severe consequences seen here perhaps suggest a 
> broader refactoring of the ZKWatcher usage in HBase is in order, to protect 
> against situations like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766194#comment-15766194
 ] 

Guanghao Zhang commented on HBASE-11392:


Pushed to master branch. Thanks all for reviewing.

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch, 
> HBASE-11392-v6.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17317) [branch-1] The updatePeerConfig method in ReplicationPeersZKImpl didn't update the table-cfs map

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17317:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [branch-1] The updatePeerConfig method in ReplicationPeersZKImpl didn't 
> update the table-cfs map
> 
>
> Key: HBASE-17317
> URL: https://issues.apache.org/jira/browse/HBASE-17317
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17317-branch-1.patch
>
>
> The updatePeerConfig method in ReplicationPeersZKImpl.java
> {code}
>   @Override
>   public void updatePeerConfig(String id, ReplicationPeerConfig newConfig)
> throws ReplicationException {
> ReplicationPeer peer = getPeer(id);
> if (peer == null){
>   throw new ReplicationException("Could not find peer Id " + id);
> }   
> ReplicationPeerConfig existingConfig = peer.getPeerConfig();
> if (newConfig.getClusterKey() != null && 
> !newConfig.getClusterKey().isEmpty() &&
> !newConfig.getClusterKey().equals(existingConfig.getClusterKey())){
>   throw new ReplicationException("Changing the cluster key on an existing 
> peer is not allowed."
>   + " Existing key '" + existingConfig.getClusterKey() + "' does not 
> match new key '"
>   + newConfig.getClusterKey() +
>   "'");
> }   
> String existingEndpointImpl = existingConfig.getReplicationEndpointImpl();
> if (newConfig.getReplicationEndpointImpl() != null &&
> !newConfig.getReplicationEndpointImpl().isEmpty() &&
> !newConfig.getReplicationEndpointImpl().equals(existingEndpointImpl)){
>   throw new ReplicationException("Changing the replication endpoint 
> implementation class " +
>   "on an existing peer is not allowed. Existing class '"
>   + existingConfig.getReplicationEndpointImpl()
>   + "' does not match new class '" + 
> newConfig.getReplicationEndpointImpl() + "'");
> }   
> //Update existingConfig's peer config and peer data with the new values, 
> but don't touch config
> // or data that weren't explicitly changed
> existingConfig.getConfiguration().putAll(newConfig.getConfiguration());
> existingConfig.getPeerData().putAll(newConfig.getPeerData());
>// Bug. We should update table-cfs map, too.
> try {
>   ZKUtil.setData(this.zookeeper, getPeerNode(id),
>   ReplicationSerDeHelper.toByteArray(existingConfig));
> }   
> catch(KeeperException ke){
>   throw new ReplicationException("There was a problem trying to save 
> changes to the " +
>   "replication peer " + id, ke);
> }   
>   }
> {code}
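
For illustration only, the missing step is presumably something along these lines (the getTableCFsMap/setTableCFsMap accessor names are assumptions, and this is not the committed branch-1 patch):
{code}
import org.apache.hadoop.hbase.replication.ReplicationPeerConfig;

// Hypothetical fix sketch: the table-cfs map must be merged as well before
// the combined config is serialized back to the peer znode.
static void mergeTableCFs(ReplicationPeerConfig existingConfig,
    ReplicationPeerConfig newConfig) {
  if (newConfig.getTableCFsMap() != null) {
    existingConfig.setTableCFsMap(newConfig.getTableCFsMap());
  }
}
{code}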



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17317) [branch-1] The updatePeerConfig method in ReplicationPeersZKImpl didn't update the table-cfs map

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15763933#comment-15763933
 ] 

Guanghao Zhang commented on HBASE-17317:


Pushed to branch-1. Thanks [~tedyu] for reviewing.

> [branch-1] The updatePeerConfig method in ReplicationPeersZKImpl didn't 
> update the table-cfs map
> 
>
> Key: HBASE-17317
> URL: https://issues.apache.org/jira/browse/HBASE-17317
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-17317-branch-1.patch
>
>
> The updatePeerConfig method in ReplicationPeersZKImpl.java
> {code}
>   @Override
>   public void updatePeerConfig(String id, ReplicationPeerConfig newConfig)
> throws ReplicationException {
> ReplicationPeer peer = getPeer(id);
> if (peer == null){
>   throw new ReplicationException("Could not find peer Id " + id);
> }   
> ReplicationPeerConfig existingConfig = peer.getPeerConfig();
> if (newConfig.getClusterKey() != null && 
> !newConfig.getClusterKey().isEmpty() &&
> !newConfig.getClusterKey().equals(existingConfig.getClusterKey())){
>   throw new ReplicationException("Changing the cluster key on an existing 
> peer is not allowed."
>   + " Existing key '" + existingConfig.getClusterKey() + "' does not 
> match new key '"
>   + newConfig.getClusterKey() +
>   "'");
> }   
> String existingEndpointImpl = existingConfig.getReplicationEndpointImpl();
> if (newConfig.getReplicationEndpointImpl() != null &&
> !newConfig.getReplicationEndpointImpl().isEmpty() &&
> !newConfig.getReplicationEndpointImpl().equals(existingEndpointImpl)){
>   throw new ReplicationException("Changing the replication endpoint 
> implementation class " +
>   "on an existing peer is not allowed. Existing class '"
>   + existingConfig.getReplicationEndpointImpl()
>   + "' does not match new class '" + 
> newConfig.getReplicationEndpointImpl() + "'");
> }   
> //Update existingConfig's peer config and peer data with the new values, 
> but don't touch config
> // or data that weren't explicitly changed
> existingConfig.getConfiguration().putAll(newConfig.getConfiguration());
> existingConfig.getPeerData().putAll(newConfig.getPeerData());
>// Bug. We should update table-cfs map, too.
> try {
>   ZKUtil.setData(this.zookeeper, getPeerNode(id),
>   ReplicationSerDeHelper.toByteArray(existingConfig));
> }   
> catch(KeeperException ke){
>   throw new ReplicationException("There was a problem trying to save 
> changes to the " +
>   "replication peer " + id, ke);
> }   
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17328) Properly dispose of looped replication peers

2016-12-18 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15760485#comment-15760485
 ] 

Guanghao Zhang commented on HBASE-17328:


It seems the metrics will be cleared twice?

> Properly dispose of looped replication peers
> 
>
> Key: HBASE-17328
> URL: https://issues.apache.org/jira/browse/HBASE-17328
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0, 0.98.23
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.5, 0.98.24, 1.1.9
>
> Attachments: HBASE-17328-1.1.v1.patch, HBASE-17328-master.v1.patch, 
> HBASE-17328-master.v2.patch, HBASE-17328.branch-1.1.v2.patch, 
> HBASE-17328.master.v3.patch
>
>
> When adding a looped replication peer (clusterId == peerClusterId), the 
> following code terminates the replication source thread, but since the source 
> manager still holds a reference, WALs continue to get enqueued, and never get 
> cleaned because they're stuck in the queue, leading to an unsustainable 
> buildup.  Furthermore, the replication statistics thread will continue to 
> print statistics for the terminated source.
> {code}
> if (clusterId.equals(peerClusterId) && 
> !replicationEndpoint.canReplicateToSameCluster()) {
>   this.terminate("ClusterId " + clusterId + " is replicating to itself: 
> peerClusterId "
>   + peerClusterId + " which is not allowed by ReplicationEndpoint:"
>   + replicationEndpoint.getClass().getName(), null, false);
> }
> {code}
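
A hedged illustration of the kind of disposal being asked for (a fragment only; the removeSource helper on the source manager is an assumption, not the actual HBASE-17328 patch):
{code}
// Illustration only: after terminating a looped source, also drop the
// manager's reference so WALs stop being enqueued for it and its metrics
// and statistics are cleaned up exactly once.
if (clusterId.equals(peerClusterId) && !replicationEndpoint.canReplicateToSameCluster()) {
  this.terminate("ClusterId " + clusterId + " is replicating to itself: peerClusterId "
      + peerClusterId + " which is not allowed by ReplicationEndpoint:"
      + replicationEndpoint.getClass().getName(), null, false);
  this.manager.removeSource(this);  // assumed helper on ReplicationSourceManager
}
{code}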



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17288) Add warn log for huge Cell and huge row

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764057#comment-15764057
 ] 

Guanghao Zhang commented on HBASE-17288:


bq. You init these new vars when parallel seek enabled. I believe simple 
mistake it is and not intended by you.
Sorry for this mistake. I will upload a new patch later.
bq. Any way 1st cell which causes the break in the row size check, will make 
into the log.
Nice, but isn't the row still needed? We need it to identify the huge row.
bq. Better we can do the row size check and the end after considering all cells 
so that we can get exactly the size of the row? Or is that not possible as per 
loop here?
This is not the real row size. When the scan sets a batch, it is only the size of 
that batch of cells. Our scans now support heartbeats, and ScannerContext has size 
and time limits, so maybe the huge-row warning is not needed... I will check the 
latest code in the master branch.
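
For reference, the per-cell check being discussed is roughly the following (the threshold name and value are illustrative, not the patch itself):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;

public class HugeCellCheck {
  private static final Log LOG = LogFactory.getLog(HugeCellCheck.class);
  // Illustrative threshold; the configurable name/value in the patch may differ.
  private static final long HUGE_CELL_WARN_SIZE = 1024L * 1024L;

  // Warn when a single cell added to the result list exceeds the threshold.
  static void warnIfHuge(Cell cell, String tableName) {
    long size = KeyValueUtil.length(cell);
    if (size > HUGE_CELL_WARN_SIZE) {
      LOG.warn("Adding a HUGE cell into result list, size=" + size
          + ", cell=" + cell + ", from table " + tableName);
    }
  }
}
{code}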

Thanks for the review. :)

> Add warn log for huge Cell and huge row
> ---
>
> Key: HBASE-17288
> URL: https://issues.apache.org/jira/browse/HBASE-17288
> Project: HBase
>  Issue Type: Improvement
>  Components: scan
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-17288-v1.patch, HBASE-17288-v2.patch, 
> HBASE-17288.patch
>
>
> Some log examples from our production cluster.
> {code}
> 2016-12-10,17:08:11,478 WARN 
> org.apache.hadoop.hbase.regionserver.StoreScanner: adding a HUGE KV into 
> result list, kv size:1253360, 
> kv:10567114001-1-c/R:r1/1481360887152/Put/vlen=1253245/ts=923099, from 
> table X
> 2016-12-10,17:08:16,724 WARN 
> org.apache.hadoop.hbase.regionserver.StoreScanner: adding a HUGE KV into 
> result list, kv size:1048680, 
> kv:0220459/I:i_0/1481360889551/Put/vlen=1048576/ts=13642, from table XX
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-11392:
---
Attachment: HBASE-11392-v6.patch

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch, 
> HBASE-11392-v6.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764040#comment-15764040
 ] 

Guanghao Zhang commented on HBASE-11392:


Attached a v5 patch addressing the review comments.

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17348) Remove the unused hbase.replication from javadoc/comment completely

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17348:
---
Status: Patch Available  (was: Open)

> Remove the unused hbase.replication from javadoc/comment completely
> ---
>
> Key: HBASE-17348
> URL: https://issues.apache.org/jira/browse/HBASE-17348
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Trivial
> Attachments: HBASE-17348.patch
>
>
> Configuration hbase.replication has been removed by HBASE-16040. But there 
> are still some hbase.replication left in javadoc of ReplicationAdmin, 
> Admin.proto and shell.rb. Let's remove it completely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11392) add/remove peer requests should be routed through master

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-11392:
---
Attachment: HBASE-11392-v5.patch

> add/remove peer requests should be routed through master
> 
>
> Key: HBASE-11392
> URL: https://issues.apache.org/jira/browse/HBASE-11392
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-11392-v1.patch, HBASE-11392-v2.patch, 
> HBASE-11392-v3.patch, HBASE-11392-v4.patch, HBASE-11392-v5.patch
>
>
> ReplicationAdmin directly operates over the zookeeper data for replication 
> setup. We should move these operations to be routed through master for two 
> reasons: 
>  - Replication implementation details are exposed to client. We should move 
> most of the replication related classes to hbase-server package. 
>  - Routing the requests through master is the standard practice for all other 
> operations. It allows for decoupling implementation details from the client 
> and code.
> Review board: https://reviews.apache.org/r/54730/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17348) Remove the unused hbase.replication from javadoc/comment completely

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17348:
---
Attachment: HBASE-17348-v1.patch

Updated the generated Java files too.

> Remove the unused hbase.replication from javadoc/comment completely
> ---
>
> Key: HBASE-17348
> URL: https://issues.apache.org/jira/browse/HBASE-17348
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Trivial
> Attachments: HBASE-17348-v1.patch, HBASE-17348.patch
>
>
> Configuration hbase.replication has been removed by HBASE-16040. But there 
> are still some hbase.replication left in javadoc of ReplicationAdmin, 
> Admin.proto and shell.rb. Let's remove it completely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17348) Remove the unused hbase.replication from javadoc/comment completely

2016-12-20 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-17348:
--

 Summary: Remove the unused hbase.replication from javadoc/comment 
completely
 Key: HBASE-17348
 URL: https://issues.apache.org/jira/browse/HBASE-17348
 Project: HBase
  Issue Type: Improvement
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang
Priority: Trivial


Configuration hbase.replication has been removed by HBASE-16040. But there are 
still some hbase.replication left in javadoc of ReplicationAdmin, Admin.proto 
and shell.rb. Let's remove it completely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17348) Remove the unused hbase.replication from javadoc/comment completely

2016-12-20 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-17348:
---
Attachment: HBASE-17348.patch

> Remove the unused hbase.replication from javadoc/comment completely
> ---
>
> Key: HBASE-17348
> URL: https://issues.apache.org/jira/browse/HBASE-17348
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Trivial
> Attachments: HBASE-17348.patch
>
>
> Configuration hbase.replication has been removed by HBASE-16040. But there 
> are still some hbase.replication left in javadoc of ReplicationAdmin, 
> Admin.proto and shell.rb. Let's remove it completely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

