[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
----------------------------------
    Description: 
Right now, multiple metastore services can be specified in the hive.metastore.uris 
configuration, but that list is static and cannot be modified dynamically. Use 
ZooKeeper for dynamic service discovery of the metastore.
h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)

The ZooKeeper-related code (for service discovery) accesses ZooKeeper 
parameters directly from HiveConf. The class is changed so that it can be 
used by both HiveServer2 and the Metastore server and works with both 
configurations. The following methods from HiveServer2 are now moved into 
ZooKeeperHiveHelper:
# startZookeeperClient
# addServerInstanceToZooKeeper
# removeServerInstanceFromZooKeeper
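
One possible reading of this refactoring is that the helper receives its ZooKeeper settings from the caller instead of reading HiveConf itself, so HiveServer2 and the metastore can each pass their own values. The sketch below illustrates only that idea. The class ZkDiscoveryParams, its fields, and its methods are hypothetical names and are not taken from the Hive code base; appending the client port to hosts that lack one is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parameter holder: none of these names come from the Hive code base.
final class ZkDiscoveryParams {
    final String quorum;     // comma-separated ZooKeeper hosts, with or without ":port"
    final String clientPort; // appended to hosts that do not carry a port (assumed behaviour)
    final String namespace;  // root znode under which server instances register

    ZkDiscoveryParams(String quorum, String clientPort, String namespace) {
        this.quorum = quorum;
        this.clientPort = clientPort;
        this.namespace = namespace;
    }

    // Builds a "host:port,host:port" connect string from the quorum list.
    String connectString() {
        List<String> servers = new ArrayList<>();
        for (String host : quorum.split(",")) {
            String h = host.trim();
            if (h.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            servers.add(h.contains(":") ? h : h + ":" + clientPort);
        }
        return String.join(",", servers);
    }

    // Root path for registered instances: "/" followed by the namespace.
    String rootPath() {
        return "/" + namespace;
    }
}
```

With such a holder, HiveServer2 would fill in the values of its HIVE_* parameters and the metastore the values of its THRIFT_* parameters, and the helper code would stay the same for both.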
h3. HiveMetaStore conf changes
 # THRIFT_URIS (hive.metastore.uris) can also be used to specify the ZooKeeper 
quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
(hive.metastore.service.discovery.mode) is set to "zookeeper", the URIs are used 
as the ZooKeeper quorum. When it is left empty, the URIs are used to locate the 
metastore directly.
 # Here is the list of HiveServer2's parameters and their proposed metastore conf 
counterparts. It looks odd that the Metastore-related configurations do not 
have their macros start with METASTORE, but start with THRIFT. I have just 
followed the naming convention used for other parameters.
#* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
(hive.metastore.zookeeper.namespace)
#* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
(hive.metastore.zookeeper.client.port)
#* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
(hive.metastore.zookeeper.connection.timeout)
#* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
(hive.metastore.zookeeper.connection.max.retries)
#* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
(hive.metastore.zookeeper.connection.basesleeptime)
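
To make the proposed keys concrete, here is a minimal sketch of a client-side configuration for ZooKeeper-based discovery, written with plain java.util.Properties rather than any Hive class. The property names are the ones proposed above. The host names, the port, the namespace value, and the exact format of the quorum entries are placeholders chosen for illustration, not defaults or confirmed syntax.

```java
import java.util.Properties;

// Illustration only: the class name and all values are hypothetical.
final class MetastoreDiscoveryConfExample {
    // Returns settings that switch the metastore URIs to "ZooKeeper quorum" meaning.
    static Properties zookeeperDiscoveryConf() {
        Properties conf = new Properties();
        // Proposed switch: "zookeeper" enables discovery, empty means direct URIs.
        conf.setProperty("hive.metastore.service.discovery.mode", "zookeeper");
        // In zookeeper mode the URIs name the quorum (entry format is assumed here).
        conf.setProperty("hive.metastore.uris",
                "zk1.example.com,zk2.example.com,zk3.example.com");
        // Metastore counterparts of the HiveServer2 ZooKeeper parameters.
        conf.setProperty("hive.metastore.zookeeper.client.port", "2181");
        conf.setProperty("hive.metastore.zookeeper.namespace", "hive_metastore");
        return conf;
    }
}
```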

The following Hive ZooKeeper configurations seem to be related to managing 
locks and seem irrelevant for the metastore's use of ZooKeeper.
# HIVE_ZOOKEEPER_SESSION_TIMEOUT
# HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES

Since there is no configuration to be published, HIVE_ZOOKEEPER_PUBLISH_CONFIGS 
does not have a THRIFT counterpart.
h3. HiveMetaStore class changes
# startMetaStore should also register the instance with ZooKeeper, when 
configured.
# When a metastore server is shut down, it should deregister itself from 
ZooKeeper, when configured.
# These changes use the refactored code described above.
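
The register-on-start and deregister-on-shutdown behaviour described above can be sketched as follows. This is only an outline under stated assumptions: InstanceRegistry, InMemoryRegistry, and MetastoreLifecycleSketch are hypothetical names, and the in-memory registry stands in for ZooKeeper so that the example runs without a ZooKeeper ensemble. A null registry models the "not configured" case.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the ZooKeeper-backed registration code.
interface InstanceRegistry {
    void register(String instanceUri);
    void deregister(String instanceUri);
}

// In-memory registry, used here only so the sketch runs without ZooKeeper.
final class InMemoryRegistry implements InstanceRegistry {
    final Set<String> instances = ConcurrentHashMap.newKeySet();

    public void register(String instanceUri) {
        instances.add(instanceUri);
    }

    public void deregister(String instanceUri) {
        instances.remove(instanceUri);
    }
}

final class MetastoreLifecycleSketch {
    private final InstanceRegistry registry; // null when discovery is not configured
    private final String instanceUri;

    MetastoreLifecycleSketch(InstanceRegistry registry, String instanceUri) {
        this.registry = registry;
        this.instanceUri = instanceUri;
    }

    void start() {
        // ... the Thrift server would be started here ...
        if (registry != null) {
            registry.register(instanceUri); // announce this instance to clients
        }
    }

    void shutdown() {
        if (registry != null) {
            registry.deregister(instanceUri); // stop advertising before going away
        }
        // ... the Thrift server would be stopped here ...
    }
}
```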

h3. HiveMetaStoreClient class changes
When the service discovery mode is "zookeeper", we fetch the metastore URIs 
from the specified ZooKeeper and treat them as if they were specified in 
THRIFT_URIS, i.e. we use the existing mechanisms to choose a metastore server 
to connect to and establish a connection.
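
A minimal sketch of that client-side decision, assuming the ZooKeeper lookup is available as a function from the quorum string to a list of metastore URIs. MetastoreUriResolverSketch and the zkLookup parameter are hypothetical; the real lookup would go through the refactored helper, which is not modelled here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustration only: not the HiveMetaStoreClient implementation.
final class MetastoreUriResolverSketch {
    // discoveryMode: value of hive.metastore.service.discovery.mode
    // uris:          value of hive.metastore.uris
    // zkLookup:      hypothetical function that asks ZooKeeper for registered metastores
    static List<String> resolve(String discoveryMode, String uris,
                                Function<String, List<String>> zkLookup) {
        if ("zookeeper".equals(discoveryMode)) {
            // The configured URIs name the quorum; the real URIs come from ZooKeeper.
            return zkLookup.apply(uris);
        }
        // Empty mode: the configured URIs point at the metastores directly.
        return Arrays.stream(uris.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

Either way the caller ends up with a plain list of metastore URIs, so the existing logic for choosing a server and connecting can stay unchanged.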

  was:Right now, multiple metastore services can be specified in 
hive.metastore.uris configuration, but that list is static and can not be 
modified dynamically. Use Zookeeper for dynamic service discovery of metastore.


> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
