busbey commented on a change in pull request #1167: Hbase 18095: Zookeeper-less 
client connection implementation
URL: https://github.com/apache/hbase/pull/1167#discussion_r378922404
 
 

 ##########
 File path: src/main/asciidoc/_chapters/configuration.adoc
 ##########
 @@ -563,38 +563,63 @@ Changes here will require a cluster restart for HBase to 
notice the change thoug
 
 If you are running HBase in standalone mode, you don't need to configure 
anything for your client to work provided that they are all on the same machine.
 
-Since the HBase Master may move around, clients bootstrap by looking to 
ZooKeeper for current critical locations.
-ZooKeeper is where all these values are kept.
-Thus clients require the location of the ZooKeeper ensemble before they can do 
anything else.
-Usually this ensemble location is kept out in the _hbase-site.xml_ and is 
picked up by the client from the `CLASSPATH`.
+Starting with release 3.0.0, the default connection registry has been switched to a 
master based implementation. Refer to <<client.masterregistry>> for more 
details about
+what a connection registry is and the implications of this change. Depending on 
your HBase version, the following is the expected minimal client configuration.
 
-If you are configuring an IDE to run an HBase client, you should include the 
_conf/_ directory on your classpath so _hbase-site.xml_ settings can be found 
(or add _src/test/resources_ to pick up the hbase-site.xml used by tests).
+==== Up until 2.x.y releases
+In 2.x.y releases, the default connection registry was based on ZooKeeper as 
the source of truth. This means that the clients always looked up ZooKeeper 
znodes to fetch
+the required metadata. For example, if an active master crashed and a new 
master was elected, clients looked up the master znode to fetch
+the active master address (similarly for meta locations). This meant that the 
clients needed to have access to ZooKeeper and needed to know
+the ZooKeeper ensemble information before they could do anything. This can be 
configured in the client configuration xml as follows:
 
-For Java applications using Maven, including the hbase-shaded-client module is 
the recommended dependency when connecting to a cluster:
 [source,xml]
 ----
-<dependency>
-  <groupId>org.apache.hbase</groupId>
-  <artifactId>hbase-shaded-client</artifactId>
-  <version>2.0.0</version>
-</dependency>
+<?xml version="1.0"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+<configuration>
+  <property>
+    <name>hbase.zookeeper.quorum</name>
+    <value>example1,example2,example3</value>
+    <description> Zookeeper ensemble information</description>
+  </property>
+</configuration>
 ----
 
-A basic example _hbase-site.xml_ for client only may look as follows:
+==== Starting 3.0.0 release
+
+The default implementation was switched to a master based connection registry. 
With this implementation, clients always contact the active or
+stand-by master RPC end points to fetch the connection registry 
information. This means that the clients should have access to the list of 
active and stand-by master
+end points before they can do anything. This can be configured in the client 
configuration xml as follows:
 
 Review comment:
   This specifically doesn't need to be _all_ of the Masters configured for the 
cluster, right?
   
   So if I wanted to hedge against clients DDoSing the masters to the point that I 
can't get an active master for the cluster, I could, e.g., give them only half of 
the Masters.
   
   Presuming that's the case, a brief entry in the troubleshooting section that 
basically says to do this if clients are causing masters to die would help 
us get ahead of that concern.
   
   I say this as someone who has had to troubleshoot clusters where bad client 
behavior essentially meant every HBase operation hit ZK for meta and, as a 
result, the cluster would die.
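
   To make the suggestion concrete, here is a rough sketch of what a client-side _hbase-site.xml_ listing only a subset of the masters might look like. It assumes the master registry reads its end points from a property named `hbase.masters`; the host names and port 16000 are placeholders for illustration, not values taken from this PR.

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Assumed property name for the master registry end points. -->
    <name>hbase.masters</name>
    <!-- Hypothetical hosts: only two of, say, four masters are exposed to clients. -->
    <value>master1.example.com:16000,master2.example.com:16000</value>
    <description>Subset of master RPC end points that clients may contact</description>
  </property>
</configuration>
```

   Whether clients configured this way can still find the active master when it is not one of the listed hosts is the behavior that would need confirming before documenting this as a troubleshooting step.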

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
