[jira] [Comment Edited] (HBASE-10569) Co-locate meta and master

2014-05-14 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997647#comment-13997647
 ] 

Jimmy Xiang edited comment on HBASE-10569 at 5/14/14 3:31 PM:
--

Totally agree with what Stack and Matteo said.

As to [~enis]'s question:

bq. 2) if we are also allowing to completely disable this feature (as in the 
other jira), will there still be benefit for this?

Big features usually come with an option to disable them at first. That is the 
basic way to introduce a major feature while giving users a smooth migration 
path at the beginning, right?


was (Author: jxiang):
Totally agree with what Stack and Matteo said. 

bq. 2) if we are also allowing to completely disable this feature (as in the 
other jira), will there still be benefit for this?

Big features usually come in with an option to disable it at first. This is a 
basic idea to introduce great features with smooth migration paths for users at 
the beginning, right?

 Co-locate meta and master
 -

 Key: HBASE-10569
 URL: https://issues.apache.org/jira/browse/HBASE-10569
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.99.0

 Attachments: Co-locateMetaAndMasterHBASE-10569.pdf, 
 hbase-10569_v1.patch, hbase-10569_v2.patch, hbase-10569_v3.1.patch, 
 hbase-10569_v3.patch, master_rs.pdf


 I was thinking about simplifying/improving region assignment. The first step 
 is to co-locate the meta and the master, as many people agreed on HBASE-5487.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HBASE-10569) Co-locate meta and master

2014-03-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933578#comment-13933578
 ] 

Jimmy Xiang edited comment on HBASE-10569 at 3/13/14 5:56 PM:
--

Attached a patch that passed unit tests, integration tests (including ITBLL), 
and some live cluster tests. Will put it on RB once RB is up.

Here is what I have done in this patch:
# Moved RPC related code out of HRegionServer and HMaster so that they are 
smaller for easier change/maintenance.
# Made HMaster extend HRegionServer so that HMaster is also an HRegionServer, 
and removed duplicate code/parameters.
# Due to 2, HMaster#getMetrics is renamed to getMasterMetrics to avoid naming 
conflict with HRegionServer#getMetrics. The same has been done to 
HMaster#getCoprocessors, #getCoprocessorHost.
# Added HRegionServer#getRpcServices and HMaster#getMasterRpcServices to expose 
the RPC functionalities.
# Changed references related to 3 and 4 (a lot, especially in tests).
# HMaster and HRegionServer share one RPC server and one InfoServer.
# RpcServiceInterface is changed a little. Methods #startThreads and #openServer 
are removed since the backup master doesn’t hold the RPC server any more. A 
parameter HMaster#serviceStarted is introduced to indicate whether a master is 
active, so that ServerNotRunningYetException can be thrown before a master is 
active.
# Master recovery in case of ZK connection loss is removed since it doesn’t 
recover listeners added in HRegionServer. We can get this feature back if 
needed. The other reason I didn’t try to get it back is that we are going to 
use raft to choose the active master instead of relying on ZK.
# HRegionServer on the active HMaster communicates with the active HMaster 
directly instead of going through the RPC. Shortcut helps.
# The master (active/backup) web UI contains info about the corresponding region 
server.
# After becoming active, a backup master moves user regions away (and moves the 
meta/namespace regions to the master if they are already assigned somewhere 
else).
# Integration testing doesn’t restart the master as a region server, or restart 
the region server that holds the meta. One reason is that the startup script 
can’t tell if a region server should be a master.
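
Items 2, 3 and 7 above can be illustrated with a minimal Java sketch. This is not the code from the patch: SketchRegionServer, SketchMaster, NotRunningYetException, becomeActive and assignRegion are hypothetical stand-ins for HRegionServer, HMaster and ServerNotRunningYetException, and the string return values are placeholders. It only shows the shape of the change: the master inherits from the region server, its metrics accessor gets a distinct name, and a serviceStarted flag rejects master calls until the master is active.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for HBase's ServerNotRunningYetException.
class NotRunningYetException extends Exception {
    NotRunningYetException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for HRegionServer: owns the single shared RPC endpoint.
class SketchRegionServer {
    private final String rpcEndpoint;

    SketchRegionServer(String rpcEndpoint) {
        this.rpcEndpoint = rpcEndpoint;
    }

    String getRpcEndpoint() {
        return rpcEndpoint;
    }

    String getMetrics() {
        return "regionserver-metrics";
    }
}

// Hypothetical stand-in for HMaster: the master is also a region server (item 2).
class SketchMaster extends SketchRegionServer {
    // False while this master is a backup, true once it is active (item 7).
    private final AtomicBoolean serviceStarted = new AtomicBoolean(false);

    SketchMaster(String rpcEndpoint) {
        super(rpcEndpoint);
    }

    // Distinct name so it does not clash with the inherited getMetrics() (item 3).
    String getMasterMetrics() {
        return "master-metrics";
    }

    // Assumed hook: called when this master wins the election and becomes active.
    void becomeActive() {
        serviceStarted.set(true);
    }

    // A master-only operation that is refused until the master is active.
    String assignRegion(String region) throws NotRunningYetException {
        if (!serviceStarted.get()) {
            throw new NotRunningYetException("Master is not active yet");
        }
        return region + " assigned via " + getRpcEndpoint();
    }
}

class CoLocationSketch {
    public static void main(String[] args) throws Exception {
        SketchMaster master = new SketchMaster("rpc-1");
        try {
            master.assignRegion("r1");
        } catch (NotRunningYetException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        master.becomeActive();
        System.out.println(master.assignRegion("r1"));
        System.out.println(master.getMetrics() + " / " + master.getMasterMetrics());
    }
}
```

Because the sketch master reuses the endpoint it inherits, there is only one RPC endpoint per process, which mirrors item 6 in spirit.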

Here is a list of things to be done (in separate issues):
# Need to make sure the master listens to the old ports (RPC + webUI) too, so 
as to support rolling upgrade from old versions (0.96+), and be backward 
compatible.
# Need to consolidate(?) chores/threads/handlers in master/regionserver so that 
the active master manager in the backup master gets a high priority and can 
grab the ZK node faster, before we move to raft.
# Clean up MetaServerShutdownHandler and HMaster#assignMeta in the next major 
release, when rolling upgrade is not an issue any more. This should be done much 
later.



was (Author: jxiang):
Attached a patch that passed unit tests, integration tests (including ITBLL), 
and some live cluster tests. Will put it on RB soon.

Here is what I have done in this patch:
* Moved RPC related code out of HRegionServer and HMaster so that they are 
smaller for easier change/maintenance.
* Make HMaster extends HRegionServer so that HMaster is also a HRegionServer, 
removed duplicate code/parameters.
* Due to B, HMaster#getMetrics is renamed to getMasterMetrics to avoid naming 
conflict with HRegionServer#getMetrics. The same has been done to 
HMaster#getCoprocessors, #getCoprocessorHost.
* Added HRegionServer#getRpcServices and HMaster#getMasterRpcServices to expose 
the RPC functionalities.
* Changed references related to C and D (a lot, especially in tests).
* HMaster and HRegionServer share one RPC server and one InfoServer.
* RpcServiceInterface is changed a little. Method #startThreads and #openServer 
are removed since backup master doesn’t hold the RPC server any more. A 
parameter HMaster#serviceStarted is introduced to indicate if a master is 
active so as ServerNotRunningYetException can be thrown before a master is 
active.
* Master recovery in case of ZK connection loss is removed since it doesn’t 
recover listeners added in HRegionServer. We can get this feature back if 
needed. The other reason I didn’t try to get it back is because we are going to 
use raft to choose active master instead of relying on ZK.
* HRegionServer on the active HMaster communicates with the active HMaster 
directly instead of going through the RPC. Shortcut helps.
* Master(active/backup) web UI contains info about the corresponding region 
server.
* Backup master moves users regions away (and meta/namespace region to the 
master if already assigned somewhere else) after becoming active.
* Integration testing doesn’t restart the master as a region server, or restart 
the region server that holds the meta. One reason is because the startup script 
can’t tell if a region server should be master.

Here is a list of things