Author: pzampino
Date: Sat Jun 23 00:30:39 2018
New Revision: 1834176

URL: http://svn.apache.org/viewvc?rev=1834176&view=rev
Log:
Documented ZooKeeper-based Knox HA support

Modified:
    knox/site/books/knox-1-1-0/user-guide.html
    knox/trunk/books/1.1.0/config_ha.md

Modified: knox/site/books/knox-1-1-0/user-guide.html
URL: 
http://svn.apache.org/viewvc/knox/site/books/knox-1-1-0/user-guide.html?rev=1834176&r1=1834175&r2=1834176&view=diff
==============================================================================
--- knox/site/books/knox-1-1-0/user-guide.html (original)
+++ knox/site/books/knox-1-1-0/user-guide.html Sat Jun 23 00:30:39 2018
@@ -3220,12 +3220,21 @@ exit
 </code></pre><p>Copy knox.service.keytab created on KDC host on to your Knox 
host <code>{GATEWAY_HOME}/conf/knox.service.keytab</code></p>
 <pre><code>chown knox knox.service.keytab
 chmod 400 knox.service.keytab
-</code></pre><h4><a 
id="Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host">Update 
<code>krb5.conf</code> at <code>{GATEWAY_HOME}/conf/krb5.conf</code> on Knox 
host</a> <a 
href="#Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host"><img 
src="markbook-section-link.png"/></a></h4><p>You could copy the 
<code>{GATEWAY_HOME}/templates/krb5.conf</code> file provided in the Knox 
binary download and customize it to suit your cluster.</p><h4><a 
id="Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host">Update
 <code>krb5JAASLogin.conf</code> at 
<code>/etc/knox/conf/krb5JAASLogin.conf</code> on Knox host</a> <a 
href="#Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host"><img
 src="markbook-section-link.png"/></a></h4><p>You could copy the 
<code>{GATEWAY_HOME}/templates/krb5JAASLogin.conf</code> file provided in the 
Knox binary download and customize it to suit your cluster.</p><h4><a 
id="Update+`gateway-site.xml`+on+Knox+host">Update <code>gateway-site.xml</code> on Knox host</a> <a 
href="#Update+`gateway-site.xml`+on+Knox+host"><img 
src="markbook-section-link.png"/></a></h4><p>Update 
<code>conf/gateway-site.xml</code> in your Knox installation and set the value 
of <code>gateway.hadoop.kerberos.secured</code> to true.</p><h4><a 
id="Restart+Knox">Restart Knox</a> <a href="#Restart+Knox"><img 
src="markbook-section-link.png"/></a></h4><p>After you do the above 
configurations and restart Knox, Knox would use SPNego to authenticate with 
Hadoop services and Oozie. There is no change in the way you make calls to Knox 
whether you use Curl or Knox DSL.</p><h3><a id="High+Availability">High 
Availability</a> <a href="#High+Availability"><img 
src="markbook-section-link.png"/></a></h3><p>This describes how Knox itself can 
be made highly available.</p><h4><a id="Configure+Knox+instances">Configure 
Knox instances</a> <a href="#Configure+Knox+instances"><img 
src="markbook-section-link.png"/></a></h4><p>All Knox instances must be synced to use the same topology credential 
keystores. These files are located under 
<code>{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks</code>.
 They are generated after the first topology deployment. Currently these files 
need to be synced manually. Here are the steps to sync topologies credentials 
keystores:</p>
+</code></pre><h4><a 
id="Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host">Update 
<code>krb5.conf</code> at <code>{GATEWAY_HOME}/conf/krb5.conf</code> on Knox 
host</a> <a 
href="#Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host"><img 
src="markbook-section-link.png"/></a></h4><p>You could copy the 
<code>{GATEWAY_HOME}/templates/krb5.conf</code> file provided in the Knox 
binary download and customize it to suit your cluster.</p><h4><a 
id="Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host">Update
 <code>krb5JAASLogin.conf</code> at 
<code>/etc/knox/conf/krb5JAASLogin.conf</code> on Knox host</a> <a 
href="#Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host"><img
 src="markbook-section-link.png"/></a></h4><p>You could copy the 
<code>{GATEWAY_HOME}/templates/krb5JAASLogin.conf</code> file provided in the 
Knox binary download and customize it to suit your cluster.</p><h4><a 
id="Update+`gateway-site.xml`+on+Knox+host">Update <code>gateway-site.xml</code> on Knox host</a> <a 
href="#Update+`gateway-site.xml`+on+Knox+host"><img 
src="markbook-section-link.png"/></a></h4><p>Update 
<code>conf/gateway-site.xml</code> in your Knox installation and set the value 
of <code>gateway.hadoop.kerberos.secured</code> to true.</p><h4><a 
id="Restart+Knox">Restart Knox</a> <a href="#Restart+Knox"><img 
src="markbook-section-link.png"/></a></h4><p>After you do the above 
configurations and restart Knox, Knox would use SPNego to authenticate with 
Hadoop services and Oozie. There is no change in the way you make calls to Knox 
whether you use Curl or Knox DSL.</p><h3><a id="High+Availability">High 
Availability</a> <a href="#High+Availability"><img 
src="markbook-section-link.png"/></a></h3><p>This describes how Knox itself can 
be made highly available.</p><p>All Knox instances must be synced to use the 
same topology credential keystores. These files are located under 
<code>{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks</code>. They are generated after the 
first topology deployment.</p><p>In addition to these topology-specific 
credentials, gateway credentials and topologies must also be kept in-sync for 
Knox to operate in an HA manner.</p><h4><a 
id="Manually+Synchronize+Knox+Instances">Manually Synchronize Knox 
Instances</a> <a href="#Manually+Synchronize+Knox+Instances"><img 
src="markbook-section-link.png"/></a></h4><p>Here are the steps to manually 
sync topology credential keystores:</p>
 <ol>
   <li>Choose a Knox instance that will be the source for topology credential 
keystores. Let&rsquo;s call it <em>keystores master</em></li>
   <li>Replace the topology credential keystores in the other Knox instances 
with topology credential keystores from the <em>keystores master</em></li>
   <li>Restart Knox instances</li>
-</ol><h4><a 
id="High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer">High
 Availability with Apache HTTP Server + mod_proxy + mod_proxy_balancer</a> <a 
href="#High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer"><img
 src="markbook-section-link.png"/></a></h4><h5><a id="1+-+Requirements">1 - 
Requirements</a> <a href="#1+-+Requirements"><img 
src="markbook-section-link.png"/></a></h5><h6><a 
id="openssl-devel">openssl-devel</a> <a href="#openssl-devel"><img 
src="markbook-section-link.png"/></a></h6><p>openssl-devel is required for 
Apache Module mod_ssl.</p>
+</ol><p>Manually synchronizing the gateway credentials and topologies involves 
using ssh/scp to copy the topology-related files to all the participating Knox 
instances, and running the Knox CLI on each participating instance to define 
the credential aliases.</p><p>This manual process can be tedious and 
error-prone. As such, ZooKeeper-based HA is recommended to simplify the 
management of these deployments.</p><h4><a 
id="High+Availability+with+Apache+ZooKeeper">High Availability with Apache 
ZooKeeper</a> <a href="#High+Availability+with+Apache+ZooKeeper"><img 
src="markbook-section-link.png"/></a></h4><p>Rather than manually keeping Knox 
HA instances in sync (in terms of credentials and topology), Knox can get 
its state from Apache ZooKeeper. By configuring all the Knox instances 
to monitor the same ZooKeeper ensemble, they can be kept in-sync by modifying 
the topology-related configuration and/or credential aliases at only one of the 
instances (using the Admin UI, Admin API, or
  Knox CLI).</p><h5><a 
id="What+is+Automatically+Synchronized+Across+Instances?">What is Automatically 
Synchronized Across Instances?</a> <a 
href="#What+is+Automatically+Synchronized+Across+Instances?"><img 
src="markbook-section-link.png"/></a></h5>
+<ul>
+  <li>Provider Configurations</li>
+  <li>Descriptors</li>
+  <li>Credential Aliases</li>
+</ul><p>When a provider configuration or descriptor is added or updated to the 
ZooKeeper ensemble, all of the participating Knox instances will get the 
change, and the affected topologies will be [re]generated and [re]deployed. 
Similarly, if one of these is deleted, the affected topologies will be deleted 
and undeployed.</p><p>When provider configurations and descriptors are added, 
modified or removed using the Admin UI or API (when the Knox instance is 
configured to monitor a ZooKeeper ensemble), then those changes will be 
automatically reflected in the associated ZooKeeper ensemble. Those changes 
will subsequently be consumed by all the other Knox instances monitoring that 
ensemble. By using the Admin UI or API, ssh/scp access to the Knox hosts can be 
avoided completely for the purpose of effecting topology 
changes.</p><p>Similarly, when the Knox CLI is used to create or delete a 
gateway alias (when the Knox instance is configured to monitor a ZooKeeper 
ensemble), that alias change is reflected in the ZooKeeper ensemble, and all other Knox instances 
monitoring that ensemble will apply the change.</p><h5><a 
id="What+is+NOT+Automatically+Synchronized+Across+Instances?">What is NOT 
Automatically Synchronized Across Instances?</a> <a 
href="#What+is+NOT+Automatically+Synchronized+Across+Instances?"><img 
src="markbook-section-link.png"/></a></h5>
+<ul>
+  <li>Topologies (XML)</li>
+  <li>Gateway config (e.g., gateway-site, gateway-logging, etc&hellip;)</li>
+</ul><p>If you&rsquo;re creating/modifying topology XML files directly, then 
there is no automated support for keeping these in sync across Knox HA 
instances.</p><p>However, if the Knox instances are running in an Apache 
Ambari-managed cluster, there is limited support for keeping topology XML files 
and gateway configuration synchronized across those 
instances.</p><p><br></p><h4><a 
id="High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer">High
 Availability with Apache HTTP Server + mod_proxy + mod_proxy_balancer</a> <a 
href="#High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer"><img
 src="markbook-section-link.png"/></a></h4><h5><a id="1+-+Requirements">1 - 
Requirements</a> <a href="#1+-+Requirements"><img 
src="markbook-section-link.png"/></a></h5><h6><a 
id="openssl-devel">openssl-devel</a> <a href="#openssl-devel"><img 
src="markbook-section-link.png"/></a></h6><p>openssl-devel is required for 
Apache Module mod_ssl.</p>
 <pre><code>sudo yum install openssl-devel
 </code></pre><h6><a id="Apache+HTTP+Server">Apache HTTP Server</a> <a 
href="#Apache+HTTP+Server"><img 
src="markbook-section-link.png"/></a></h6><p>Apache HTTP Server 2.4.6 or later 
is required. See this document for installing and setting up Apache HTTP 
Server: <a 
href="http://httpd.apache.org/docs/2.4/install.html">http://httpd.apache.org/docs/2.4/install.html</a></p><p>Hint:
 pass <code>--enable-ssl</code> to the <code>./configure</code> command to 
enable the generation of the Apache Module <em>mod_ssl</em>.</p><h6><a 
id="Apache+Module+mod_proxy">Apache Module mod_proxy</a> <a 
href="#Apache+Module+mod_proxy"><img 
src="markbook-section-link.png"/></a></h6><p>See this document for setting up 
Apache Module mod_proxy: <a 
href="http://httpd.apache.org/docs/2.4/mod/mod_proxy.html">http://httpd.apache.org/docs/2.4/mod/mod_proxy.html</a></p><h6><a
 id="Apache+Module+mod_proxy_balancer">Apache Module mod_proxy_balancer</a> <a 
href="#Apache+Module+mod_proxy_balancer"><img src="markbook-section-link.png"/></a></h6><p>See this document for setting up Apache Module 
mod_proxy_balancer: <a 
href="http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html">http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html</a></p><h6><a
 id="Apache+Module+mod_ssl">Apache Module mod_ssl</a> <a 
href="#Apache+Module+mod_ssl"><img 
src="markbook-section-link.png"/></a></h6><p>See this document for setting up 
Apache Module mod_ssl: <a 
href="http://httpd.apache.org/docs/2.4/mod/mod_ssl.html">http://httpd.apache.org/docs/2.4/mod/mod_ssl.html</a></p><h5><a
 id="2+-+Configuration+example">2 - Configuration example</a> <a 
href="#2+-+Configuration+example"><img 
src="markbook-section-link.png"/></a></h5><h6><a 
id="Generate+certificate+for+Apache+HTTP+Server">Generate certificate for 
Apache HTTP Server</a> <a 
href="#Generate+certificate+for+Apache+HTTP+Server"><img 
src="markbook-section-link.png"/></a></h6><p>See this document for an example: 
<a href="http://www.akadia.com/services/ssh_test_certificate.html">http://www.akadia.com/services/ssh_test_certificate.html</a></p><p>By
 convention, Apache HTTP Server and Knox certificates are put into 
/etc/apache2/ssl/ folder.</p><h6><a 
id="Update+Apache+HTTP+Server+configuration+file">Update Apache HTTP Server 
configuration file</a> <a 
href="#Update+Apache+HTTP+Server+configuration+file"><img 
src="markbook-section-link.png"/></a></h6><p>This file is located under 
{APACHE_HOME}/conf/httpd.conf.</p><p>Following directives have to be added or 
uncommented in the configuration file:</p>
 <ul>

Modified: knox/trunk/books/1.1.0/config_ha.md
URL: 
http://svn.apache.org/viewvc/knox/trunk/books/1.1.0/config_ha.md?rev=1834176&r1=1834175&r2=1834176&view=diff
==============================================================================
--- knox/trunk/books/1.1.0/config_ha.md (original)
+++ knox/trunk/books/1.1.0/config_ha.md Sat Jun 23 00:30:39 2018
@@ -19,18 +19,55 @@
 
 This describes how Knox itself can be made highly available.
 
-#### Configure Knox instances ####
-
 All Knox instances must be synced to use the same topology credential 
keystores.
 These files are located under 
`{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks`.
 They are generated after the first topology deployment.
-Currently these files need to be synced manually.
-Here are the steps to sync topologies credentials keystores:
+
+In addition to these topology-specific credentials, gateway credentials and 
topologies must also be kept in-sync for Knox to operate in an HA manner.
+
+#### Manually Synchronize Knox Instances ####
+
+Here are the steps to manually sync topology credential keystores:
 
 1. Choose a Knox instance that will be the source for topology credential 
keystores. Let's call it _keystores master_
 2. Replace the topology credential keystores in the other Knox instances with 
topology credential keystores from the _keystores master_
 3. Restart Knox instances
 
+Manually synchronizing the gateway credentials and topologies involves using 
ssh/scp to copy the topology-related files to all the participating Knox 
instances, and running the Knox CLI on each participating instance to define 
the credential aliases.
+
+This manual process can be tedious and error-prone. As such, ZooKeeper-based 
HA is recommended to simplify the management of these deployments.
+
+#### High Availability with Apache ZooKeeper ####
+
+Rather than manually keeping Knox HA instances in sync (in terms of 
credentials and topology), Knox can get its state from Apache ZooKeeper.
+By configuring all the Knox instances to monitor the same ZooKeeper ensemble, 
they can be kept in-sync by modifying the topology-related
+configuration and/or credential aliases at only one of the instances (using 
the Admin UI, Admin API, or Knox CLI).
+
+##### What is Automatically Synchronized Across Instances?
+
+* Provider Configurations
+* Descriptors
+* Credential Aliases
+
+When a provider configuration or descriptor is added or updated to the 
ZooKeeper ensemble, all of the participating Knox instances will get the 
change, and the affected topologies will be [re]generated and [re]deployed. 
Similarly, if one of these is deleted, the affected topologies will be deleted 
and undeployed.
+
+When provider configurations and descriptors are added, modified or removed 
using the Admin UI or API (when the Knox instance is configured to monitor a 
ZooKeeper ensemble), then those changes will be automatically reflected in the 
associated ZooKeeper ensemble. Those changes will subsequently be consumed by 
all the other Knox instances monitoring that ensemble.
+By using the Admin UI or API, ssh/scp access to the Knox hosts can be avoided 
completely for the purpose of effecting topology changes.
+
+Similarly, when the Knox CLI is used to create or delete a gateway alias (when 
the Knox instance is configured to monitor a ZooKeeper ensemble), that alias 
change is reflected in the ZooKeeper ensemble, and all other Knox instances 
monitoring that ensemble will apply the change.
+
+
+##### What is NOT Automatically Synchronized Across Instances?
+
+* Topologies (XML)
+* Gateway config (e.g., gateway-site, gateway-logging, etc...)
+
+If you're creating/modifying topology XML files directly, then there is no 
automated support for keeping these in sync across Knox HA instances.
+
+However, if the Knox instances are running in an Apache Ambari-managed 
cluster, there is limited support for keeping topology XML files and gateway 
configuration synchronized across those instances.
+
+<br>
+
 #### High Availability with Apache HTTP Server + mod_proxy + 
mod_proxy_balancer ####
 
 ##### 1 - Requirements #####

