Updated latest site

Project: http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/commit/a876d178
Tree: 
http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/tree/a876d178
Diff: 
http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/diff/a876d178

Branch: refs/heads/asf-site
Commit: a876d1782a9a2b5380ed1729dcd3407d12e119fe
Parents: 02b2068
Author: Shwetha GS <[email protected]>
Authored: Mon Apr 25 09:13:18 2016 +0530
Committer: Shwetha GS <[email protected]>
Committed: Mon Apr 25 09:13:18 2016 +0530

----------------------------------------------------------------------
 Architecture.html                |  13 +-
 Bridge-Falcon.html               |  15 +-
 Bridge-Hive.html                 |  14 +-
 Bridge-Sqoop.html                |  10 +-
 Configuration.html               |  63 ++++++-
 HighAvailability.html            | 120 ++++++++++++--
 InstallationSteps.html           |  85 +++++++---
 Notification-Entity.html         |   6 +-
 QuickStart.html                  |   6 +-
 Repository.html                  |   6 +-
 Search.html                      |  14 +-
 Security.html                    |  11 +-
 StormAtlasHook.html              | 298 ++++++++++++++++++++++++++++++++++
 TypeSystem.html                  |  18 +-
 api/application.wadl             |  73 +++++++++
 api/resource_AdminResource.html  |  19 +++
 api/resource_EntityResource.html | 120 ++++++++++++++
 index.html                       |   9 +-
 issue-tracking.html              |   6 +-
 license.html                     |   6 +-
 mail-lists.html                  |   6 +-
 project-info.html                |   6 +-
 source-repository.html           |  12 +-
 team-list.html                   |  55 ++++---
 24 files changed, 851 insertions(+), 140 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Architecture.html
----------------------------------------------------------------------
diff --git a/Architecture.html b/Architecture.html
index 461d55c..480aae0 100644
--- a/Architecture.html
+++ b/Architecture.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Architecture</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -217,17 +217,18 @@
 <li><b>Notification Server</b>: Atlas uses Apache Kafka as a notification 
server for communication between hooks and downstream consumers of metadata 
notification events. Events are written by the hooks and Atlas to different 
Kafka topics. Kafka enables a loosely coupled integration between these 
disparate systems.</li></ul></div>
 <div class="section">
 <h3><a name="Bridges"></a>Bridges</h3>
-<p>External components like hive/sqoop/storm/falcon should model their 
taxonomy using typesystem and register the types with Atlas. For every entity 
created in this external component, the corresponding entity should be 
registered in Atlas as well. This is typically done in a hook which runs in the 
external component and is called for every entity operation. Hook generally 
processes the entity asynchronously using a thread pool to avoid adding latency 
to the main operation. The hook can then build the entity and register the 
entity using Atlas REST APIs. Howerver, any failure in APIs because of network 
issue etc can in result entity not registered in Atlas and hence inconsistent 
metadata.</p>
+<p>External components like hive/sqoop/storm/falcon should model their 
taxonomy using typesystem and register the types with Atlas. For every entity 
created in this external component, the corresponding entity should be 
registered in Atlas as well. This is typically done in a hook which runs in the 
external component and is called for every entity operation. The hook generally 
processes the entity asynchronously using a thread pool to avoid adding latency 
to the main operation. The hook can then build the entity and register the 
entity using Atlas REST APIs. However, any failure in the APIs, because of 
network issues etc., can result in the entity not being registered in Atlas and 
hence inconsistent metadata.</p>
 <p>Atlas exposes notification interface and can be used for reliable entity 
registration by hook as well. The hook can send notification message containing 
the list of entities to be registered.  Atlas service contains hook consumer 
that listens to these messages and registers the entities.</p>
 <p>Available bridges are:</p>
 <ul>
 <li><a href="./Bridge-Hive.html">Hive Bridge</a></li>
 <li><a href="./Bridge-Sqoop.html">Sqoop Bridge</a></li>
-<li><a href="./Bridge-Falcon.html">Falcon Bridge</a></li></ul></div>
+<li><a href="./Bridge-Falcon.html">Falcon Bridge</a></li>
+<li><a href="./StormAtlasHook.html">Storm Bridge</a></li></ul></div>
 <div class="section">
 <h3><a name="Notification"></a>Notification</h3>
 <p>Notification is used for reliable entity registration from hooks and for 
entity/type change notifications. Atlas, by default, provides Kafka 
integration, but its possible to provide other implementations as well. Atlas 
service starts embedded Kafka server by default.</p>
-<p>Atlas also provides <a 
href="./NotificationHookConsumer.html">NotificationHookConsumer</a> that runs 
in Atlas Service and listens to messages from hook and registers the entities 
in Atlas. <img src="images/twiki/notification.png" alt="" /></p></div>
+<p>Atlas also provides NotificationHookConsumer, which runs in the Atlas Service, 
listens to messages from hooks and registers the entities in Atlas. <img 
src="images/twiki/notification.png" alt="" /></p></div>
                   </div>
           </div>
 

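A minimal sketch, for illustration only, of the REST registration path that the
Architecture.html change above describes (hooks build an entity and register it
using Atlas REST APIs). The host/port and the entity payload below are assumed
values; the exact request format for the entities resource is documented in
api/resource_EntityResource.html, also updated in this commit.

    # Hypothetical sketch: register an entity over the Atlas REST API.
    # Assumes the 'requests' library is available; the payload shape is
    # illustrative only -- consult api/resource_EntityResource.html for the
    # exact format expected by the server.
    import json
    import requests

    ATLAS_URL = "http://host1.company.com:21000"   # example server address

    entity = {
        "typeName": "hive_table",                  # a type registered with Atlas
        "values": {"name": "sales_fact",
                   "description": "illustrative entity"},
    }

    resp = requests.post(ATLAS_URL + "/api/atlas/entities",
                         data=json.dumps([entity]),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()
    print(resp.status_code, resp.text)
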
http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Bridge-Falcon.html
----------------------------------------------------------------------
diff --git a/Bridge-Falcon.html b/Bridge-Falcon.html
index bcf43c9..df7f952 100644
--- a/Bridge-Falcon.html
+++ b/Bridge-Falcon.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Falcon Atlas Bridge</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -219,8 +219,13 @@ falcon_process(ClassType) - super types [Process] - 
attributes [timestamp, owned
 <ul>
 <li>Add 'org.apache.falcon.atlas.service.AtlasService' to application.services 
in &lt;falcon-conf&gt;/startup.properties</li>
 <li>Link falcon hook jars in falcon classpath - 'ln -s 
&lt;atlas-home&gt;/hook/falcon/* 
&lt;falcon-home&gt;/server/webapp/falcon/WEB-INF/lib/'</li>
-<li>Copy &lt;atlas-conf&gt;/client.properties and 
&lt;atlas-conf&gt;/atlas-application.properties to the falcon conf 
directory.</li></ul>
-<p>The following properties in &lt;atlas-conf&gt;/client.properties control 
the thread pool and notification details:</p>
+<li>In &lt;falcon_conf&gt;/falcon-env.sh, set an environment variable as 
follows:</li></ul>
+<div class="source">
+<pre>
+     export FALCON_SERVER_OPTS=&quot;$FALCON_SERVER_OPTS 
-Datlas.conf=&lt;atlas-conf&gt;&quot;
+     
+</pre></div>
+<p>The following properties in &lt;atlas-conf&gt;/atlas-application.properties 
control the thread pool and notification details:</p>
 <ul>
 <li>atlas.hook.falcon.synchronous - boolean, true to run the hook 
synchronously. default false</li>
 <li>atlas.hook.falcon.numRetries - number of retries for notification failure. 
default 3</li>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Bridge-Hive.html
----------------------------------------------------------------------
diff --git a/Bridge-Hive.html b/Bridge-Hive.html
index c725b1a..95d391a 100644
--- a/Bridge-Hive.html
+++ b/Bridge-Hive.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Hive Atlas Bridge</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -231,7 +231,7 @@ hive_process(ClassType) - super types [Process] - 
attributes [startTime, endTime
 <li>hive_process - attribute name - &lt;queryString&gt; - trimmed query string 
in lower case</li></ul></div>
 <div class="section">
 <h3><a name="Importing_Hive_Metadata"></a>Importing Hive Metadata</h3>
-<p>org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the hive metadata 
into Atlas using the model defined in 
org.apache.atlas.hive.model.HiveDataModelGenerator. import-hive.sh command can 
be used to facilitate this. Set the following configuration in 
&lt;atlas-conf&gt;/client.properties and set environment variable 
$HIVE_CONF_DIR to the hive conf directory:</p>
+<p>org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the hive metadata 
into Atlas using the model defined in 
org.apache.atlas.hive.model.HiveDataModelGenerator. import-hive.sh command can 
be used to facilitate this. Set the following configuration in 
&lt;atlas-conf&gt;/atlas-application.properties and set environment variable 
$HIVE_CONF_DIR to the hive conf directory:</p>
 <div class="source">
 <pre>
     &lt;property&gt;
@@ -270,8 +270,8 @@ hive_process(ClassType) - super types [Process] - 
attributes [startTime, endTime
 <p></p>
 <ul>
 <li>Add 'export HIVE_AUX_JARS_PATH=&lt;atlas package&gt;/hook/hive' in 
hive-env.sh of your hive configuration</li>
-<li>Copy &lt;atlas-conf&gt;/client.properties and 
&lt;atlas-conf&gt;/atlas-application.properties to the hive conf 
directory.</li></ul>
-<p>The following properties in &lt;atlas-conf&gt;/client.properties control 
the thread pool and notification details:</p>
+<li>Copy &lt;atlas-conf&gt;/atlas-application.properties to the hive conf 
directory.</li></ul>
+<p>The following properties in &lt;atlas-conf&gt;/atlas-application.properties 
control the thread pool and notification details:</p>
 <ul>
 <li>atlas.hook.hive.synchronous - boolean, true to run the hook synchronously. 
default false</li>
 <li>atlas.hook.hive.numRetries - number of retries for notification failure. 
default 3</li>
@@ -285,7 +285,7 @@ hive_process(ClassType) - super types [Process] - 
attributes [startTime, endTime
 <p></p>
 <ul>
 <li>Since database name, table name and column names are case insensitive in 
hive, the corresponding names in entities are lowercase. So, any search APIs 
should use lowercase while querying on the entity names</li>
-<li>Only the following hive operations are captured by hive hook currently - 
create database, create table, create view, CTAS, load, import, export, query, 
alter table rename and alter view rename</li></ul></div>
+<li>Only the following hive operations are captured by hive hook currently - 
create database, create table, create view, CTAS, load, import, export, query, 
alter database, alter table (except alter table replace columns and alter table 
change column position), alter view (except replacing and changing column 
position)</li></ul></div>
                   </div>
           </div>
 

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Bridge-Sqoop.html
----------------------------------------------------------------------
diff --git a/Bridge-Sqoop.html b/Bridge-Sqoop.html
index 9fa6414..7e9764f 100644
--- a/Bridge-Sqoop.html
+++ b/Bridge-Sqoop.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Sqoop Atlas Bridge</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -215,14 +215,14 @@ sqoop_dbdatastore(ClassType) - super types [DataSet] - 
attributes [name, dbStore
 <p>The entities are created and de-duped using unique qualified name. They 
provide namespace and can be used for querying as well: sqoop_process - 
attribute name - sqoop-dbStoreType-storeUri-endTime sqoop_dbdatastore - 
attribute name - dbStoreType-connectorUrl-source</p></div>
 <div class="section">
 <h3><a name="Sqoop_Hook"></a>Sqoop Hook</h3>
-<p>Sqoop added a <a 
href="./SqoopJobDataPublisher.html">SqoopJobDataPublisher</a> that publishes 
data to Atlas after completion of import Job. Today, only hiveImport is 
supported in sqoopHook. This is used to add entities in Atlas using the model 
defined in org.apache.atlas.sqoop.model.SqoopDataModelGenerator. Follow these 
instructions in your sqoop set-up to add sqoop hook for Atlas in 
&lt;sqoop-conf&gt;/sqoop-site.xml:</p>
+<p>Sqoop added a SqoopJobDataPublisher that publishes data to Atlas after 
completion of the import job. Today, only hiveImport is supported in sqoopHook. 
This is used to add entities in Atlas using the model defined in 
org.apache.atlas.sqoop.model.SqoopDataModelGenerator. Follow these instructions 
in your sqoop set-up to add sqoop hook for Atlas in 
&lt;sqoop-conf&gt;/sqoop-site.xml:</p>
 <p></p>
 <ul>
 <li>Sqoop Job publisher class.  Currently only one publishing class is 
supported</li></ul><property>      <name>sqoop.job.data.publish.class</name>    
  <value>org.apache.atlas.sqoop.hook.SqoopHook</value>    </property>
 <ul>
 <li>Atlas cluster name</li></ul><property>      
<name>atlas.cluster.name</name>      <value><clustername></value>    </property>
 <ul>
-<li>Copy &lt;atlas-conf&gt;/atlas-application.properties and 
&lt;atlas-conf&gt;/client.properties to to the sqoop conf directory 
&lt;sqoop-conf&gt;/</li>
+<li>Copy &lt;atlas-conf&gt;/atlas-application.properties to the sqoop conf 
directory &lt;sqoop-conf&gt;/</li>
 <li>Link &lt;atlas-home&gt;/hook/sqoop/*.jar in sqoop lib</li></ul>
 <p>Refer <a href="./Configuration.html">Configuration</a> for notification 
related configurations</p></div>
 <div class="section">

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Configuration.html
----------------------------------------------------------------------
diff --git a/Configuration.html b/Configuration.html
index eecc8e7..b8f74c5 100644
--- a/Configuration.html
+++ b/Configuration.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Configuring Apache Atlas - Application 
Properties</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -243,7 +243,8 @@ zookeeper.znode.parent=/hbase-unsecure
    kinit -k -t &lt;hbase keytab&gt; &lt;hbase principal&gt;
    echo &quot;grant 'atlas', 'RWXCA', 'titan'&quot; | hbase shell
 
-</pre></div></div>
+</pre></div>
+<p>Note that HBase is included in the distribution so that a standalone 
instance of HBase can be started as the default storage backend for the graph 
repository.</p></div>
 <div class="section">
 <h4><a name="Graph_Search_Index"></a>Graph Search Index</h4>
 <p>This section sets up the graph db - titan - to use an search indexing 
system. The example configuration below sets up to use an embedded Elastic 
search indexing system.</p>
@@ -274,10 +275,10 @@ atlas.graph.index.search.elasticsearch.create.sleep=2000
 <p>Refer <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/bdb.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/bdb.html</a>
 and <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/hbase.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/hbase.html</a>
 for choosing between the persistence backends. BerkeleyDB is suitable for 
smaller data sets in the range of upto 10 million vertices with ACID gurantees. 
HBase on the other hand doesnt provide ACID guarantees but is able to scale for 
larger graphs. HBase also provides HA inherently.</p></div>
 <div class="section">
 <h4><a name="Choosing_between_Indexing_Backends"></a>Choosing between Indexing 
Backends</h4>
-<p>Refer <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html</a>
 and <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html</a>
 for chossing between <a href="./ElasticSarch.html">ElasticSarch</a> and Solr. 
Solr in cloud mode is the recommended setup.</p></div>
+<p>Refer <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html</a>
 and <a class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html";>http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html</a>
 for choosing between ElasticSearch and Solr. Solr in cloud mode is the 
recommended setup.</p></div>
 <div class="section">
 <h4><a name="Switching_Persistence_Backend"></a>Switching Persistence 
Backend</h4>
-<p>For switching the storage backend from BerkeleyDB to HBase and vice versa, 
refer the documentation for &quot;Graph Persistence Engine&quot; described 
above and restart ATLAS. The data in the indexing backend needs to be cleared 
else there will be discrepancies between the storage and indexing backend which 
could result in errors during the search. <a 
href="./ElasticSearch.html">ElasticSearch</a> runs by default in embedded mode 
and the data could easily be cleared by deleting the ATLAS_HOME/data/es 
directory. For Solr, the collections which were created during ATLAS 
Installation - vertex_index, edge_index, fulltext_index could be deleted which 
will cleanup the indexes</p></div>
+<p>For switching the storage backend from BerkeleyDB to HBase and vice versa, 
refer to the documentation for &quot;Graph Persistence Engine&quot; described 
above and restart ATLAS. The data in the indexing backend needs to be cleared 
else there will be discrepancies between the storage and indexing backend which 
could result in errors during the search. ElasticSearch runs by default in 
embedded mode and the data could easily be cleared by deleting the 
ATLAS_HOME/data/es directory. For Solr, the collections which were created 
during ATLAS Installation - vertex_index, edge_index, fulltext_index could be 
deleted, which will clean up the indexes.</p></div>
 <div class="section">
 <h4><a name="Switching_Index_Backend"></a>Switching Index Backend</h4>
 <p>Switching the Index backend requires clearing the persistence backend data. 
Otherwise there will be discrepancies between the persistence and index 
backends since switching the indexing backend means index data will be lost. 
This leads to &quot;Fulltext&quot; queries not working on the existing data For 
clearing the data for BerkeleyDB, delete the ATLAS_HOME/data/berkeley directory 
For clearing the data for HBase, in Hbase shell, run 'disable titan' and 'drop 
titan'</p></div>
@@ -336,6 +337,56 @@ 
atlas.rest.address=&lt;http/https&gt;://&lt;atlas-fqdn&gt;:&lt;atlas port&gt; -
 atlas.enableTLS=false
 
 </pre></div></div>
+<div class="section">
+<h3><a name="High_Availability_Properties"></a>High Availability 
Properties</h3>
+<p>The following properties describe High Availability related configuration 
options:</p>
+<div class="source">
+<pre>
+# Set the following property to true, to enable High Availability. Default = 
false.
+atlas.server.ha.enabled=true
+
+# Define a unique set of strings to identify each instance that should run an 
Atlas Web Service instance as a comma separated list.
+atlas.server.ids=id1,id2
+# For each string defined above, define the host and port on which the Atlas 
server binds.
+atlas.server.address.id1=host1.company.com:21000
+atlas.server.address.id2=host2.company.com:31000
+
+# Specify Zookeeper properties needed for HA.
+# Specify the list of services running Zookeeper servers as a comma separated 
list.
+atlas.server.ha.zookeeper.connect=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181
+# Specify how many times a connection to the Zookeeper cluster should be 
retried, in case of any connection issues.
+atlas.server.ha.zookeeper.num.retries=3
+# Specify how long the server should wait before re-attempting connections 
to Zookeeper, in case of any connection issues.
+atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
+# Specify how long a session to Zookeeper can remain inactive before it is 
deemed unreachable.
+atlas.server.ha.zookeeper.session.timeout.ms=20000
+
+# Specify the scheme and the identity to be used for setting up ACLs on nodes 
created in Zookeeper for HA.
+# The format of these options is &lt;scheme&gt;:&lt;identity&gt;. For more 
information refer to 
http://zookeeper.apache.org/doc/r3.2.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl.
+# The 'acl' option allows you to specify a scheme, identity pair to set up an 
ACL for.
+atlas.server.ha.zookeeper.acl=auth:sasl:[email protected]
+# The 'auth' option specifies the authentication that should be used for 
connecting to Zookeeper.
+atlas.server.ha.zookeeper.auth=sasl:[email protected]
+
+# Since Zookeeper is a shared service that is typically used by many 
components,
+# it is preferable for each component to set its znodes under a namespace.
+# Specify the namespace under which the znodes should be written. Default = 
/apache_atlas
+atlas.server.ha.zookeeper.zkroot=/apache_atlas
+
+# Specify number of times a client should retry with an instance before 
selecting another active instance, or failing an operation.
+atlas.client.ha.retries=4
+# Specify interval between retries for a client.
+atlas.client.ha.sleep.interval.ms=5000
+
+</pre></div></div>
+<div class="section">
+<h3><a name="Server_Properties"></a>Server Properties</h3>
+<div class="source">
+<pre>
+# Set the following property to true, to enable the setup steps to run on each 
server start. Default = false.
+atlas.server.run.setup.on.start=false
+
+</pre></div></div>
                   </div>
           </div>
 

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/HighAvailability.html
----------------------------------------------------------------------
diff --git a/HighAvailability.html b/HighAvailability.html
index 56cb376..f91e088 100644
--- a/HighAvailability.html
+++ b/HighAvailability.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Fault Tolerance and High Availability 
Options</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -203,32 +203,121 @@
 <h2><a name="Fault_Tolerance_and_High_Availability_Options"></a>Fault 
Tolerance and High Availability Options</h2></div>
 <div class="section">
 <h3><a name="Introduction"></a>Introduction</h3>
-<p>Apache Atlas uses and interacts with a variety of systems to provide 
metadata management and data lineage to data administrators. By choosing and 
configuring these dependencies appropriately, it is possible to achieve a good 
degree of service availability with Atlas. This document describes the state of 
high availability support in Atlas, including its capabilities and current 
limitations, and also the configuration required for achieving a this level of 
high availability.</p>
+<p>Apache Atlas uses and interacts with a variety of systems to provide 
metadata management and data lineage to data administrators. By choosing and 
configuring these dependencies appropriately, it is possible to achieve a high 
degree of service availability with Atlas. This document describes the state of 
high availability support in Atlas, including its capabilities and current 
limitations, and also the configuration required for achieving this level of 
high availability.</p>
 <p><a href="./Architecture.html">The architecture page</a> in the wiki gives 
an overview of the various components that make up Atlas. The options mentioned 
below for various components derive context from the above page, and would be 
worthwhile to review before proceeding to read this page.</p></div>
 <div class="section">
 <h3><a name="Atlas_Web_Service"></a>Atlas Web Service</h3>
-<p>Currently, the Atlas Web service has a limitation that it can only have one 
active instance at a time. Therefore, in case of errors to the host running the 
service, a new Atlas web service instance should be brought up and pointed to 
from the clients. In future versions of the system, we plan to provide full 
High Availability of the service, thereby enabling hot failover. To minimize 
service loss, we recommend the following:</p>
+<p>Currently, the Atlas Web Service has a limitation that it can only have one 
active instance at a time. In earlier releases of Atlas, a backup instance 
could be provisioned and kept available. However, a manual failover was 
required to make this backup instance active.</p>
+<p>From this release, Atlas will support multiple instances of the Atlas Web 
service in an active/passive configuration with automated failover. This means 
that users can deploy and start multiple instances of the Atlas Web Service on 
different physical hosts at the same time. One of these instances will be 
automatically selected as an 'active' instance to service user requests. The 
others will automatically be deemed 'passive'. If the 'active' instance becomes 
unavailable either because it is deliberately stopped, or due to unexpected 
failures, one of the other instances will automatically be elected as an 
'active' instance and start to service user requests.</p>
+<p>An 'active' instance is the only instance that can respond to user requests 
correctly. It can create, delete, modify or respond to queries on metadata 
objects. A 'passive' instance will accept user requests, but will redirect them 
using HTTP redirect to the currently known 'active' instance. Specifically, a 
passive instance will not itself respond to any queries on metadata objects. 
However, all instances (both active and passive), will respond to admin 
requests that return information about that instance.</p>
+<p>When configured in a High Availability mode, users can get the following 
operational benefits:</p>
 <p></p>
 <ul>
-<li>An extra physical host with the Atlas system software and configuration is 
available to be brought up on demand.</li>
-<li>It would be convenient to have the web service fronted by a proxy solution 
like <a class="externalLink" 
href="https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#5.2";>HAProxy</a>
 which can be used to provide both the monitoring and transparent switching of 
the backend instance clients talk to.
+<li><b>Uninterrupted service during maintenance intervals</b>: If an active 
instance of the Atlas Web Service needs to be brought down for maintenance, 
another instance would automatically become active and can service 
requests.</li>
+<li><b>Uninterrupted service in event of unexpected failures</b>: If an active 
instance of the Atlas Web Service fails due to software or hardware errors, 
another instance would automatically become active and can service 
requests.</li></ul>
+<p>In the following sub-sections, we describe the steps required to set up High 
Availability for the Atlas Web Service. We also describe how the deployment and 
client can be designed to take advantage of this capability. Finally, we 
describe a few details of the underlying implementation.</p></div>
+<div class="section">
+<h4><a name="Setting_up_the_High_Availability_feature_in_Atlas"></a>Setting up 
the High Availability feature in Atlas</h4>
+<p>The following pre-requisites must be met for setting up the High 
Availability feature.</p>
+<p></p>
+<ul>
+<li>Ensure that you install Apache Zookeeper on a cluster of machines (a 
minimum of 3 servers is recommended for production).</li>
+<li>Select 2 or more physical machines to run the Atlas Web Service instances 
on. These machines define what we refer to as a 'server ensemble' for 
Atlas.</li></ul>
+<p>To set up High Availability in Atlas, a few configuration options must be 
defined in the <tt>atlas-application.properties</tt> file. While the complete 
list of configuration items is defined in the <a 
href="./Configuration.html">Configuration Page</a>, this section lists a few of 
the main options.</p>
+<p></p>
+<ul>
+<li>High Availability is an optional feature in Atlas. Hence, it must be 
enabled by setting the configuration option <tt>atlas.server.ha.enabled</tt> to 
true.</li>
+<li>Next, define a list of identifiers, one for each physical machine you have 
selected for the Atlas Web Service instance. These identifiers can be simple 
strings like <tt>id1</tt>, <tt>id2</tt> etc. They should be unique and should 
not contain a comma.</li>
+<li>Define a comma separated list of these identifiers as the value of the 
option <tt>atlas.server.ids</tt>.</li>
+<li>For each physical machine, list the IP Address/hostname and port as the 
value of the configuration <tt>atlas.server.address.id</tt>, where <tt>id</tt> 
refers to the identifier string for this physical machine.
+<ul>
+<li>For e.g., if you have selected 2 machines with hostnames 
<tt>host1.company.com</tt> and <tt>host2.company.com</tt>, you can define the 
configuration options as below:</li></ul></li></ul>
+<div class="source">
+<pre>
+      atlas.server.ids=id1,id2
+      atlas.server.address.id1=host1.company.com:21000
+      atlas.server.address.id2=host2.company.com:21000
+      
+</pre></div>
+<p></p>
 <ul>
-<li>An example HAProxy configuration of this form will allow a transparent 
failover to a backup server:</li></ul></li></ul>
+<li>Define the Zookeeper quorum which will be used by the Atlas High 
Availability feature.</li></ul>
 <div class="source">
 <pre>
-      listen atlas
-        bind &lt;proxy hostname&gt;:&lt;proxy port&gt;
-        balance roundrobin
-        server inst1 &lt;atlas server hostname&gt;:&lt;port&gt; check
-        server inst2 &lt;atlas backup server hostname&gt;:&lt;port&gt; check 
backup
+      
atlas.server.ha.zookeeper.connect=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181
       
 </pre></div>
 <p></p>
 <ul>
-<li>The stores that hold Atlas data can be configured to be highly available 
as described below.</li></ul></div>
+<li>You can review other configuration options that are defined for the High 
Availability feature, and set them up as desired in the 
<tt>atlas-application.properties</tt> file.</li>
+<li>For production environments, the components that Atlas depends on must 
also be set up in High Availability mode. This is described in detail in the 
following sections. Follow those instructions to setup and configure them.</li>
+<li>Install the Atlas software on the selected physical machines.</li>
+<li>Copy the <tt>atlas-application.properties</tt> file created using the 
steps above to the configuration directory of all the machines.</li>
+<li>Start the dependent components.</li>
+<li>Start each instance of the Atlas Web Service.</li></ul>
+<p>To verify that High Availability is working, run the following script on 
each of the instances where Atlas Web Service is installed.</p>
+<div class="source">
+<pre>
+$ATLAS_HOME/bin/atlas_admin.py -status
+
+</pre></div>
+<p>This script can print one of the values below as response:</p>
+<p></p>
+<ul>
+<li><b>ACTIVE</b>: This instance is active and can respond to user 
requests.</li>
+<li><b>PASSIVE</b>: This instance is PASSIVE. It will redirect any user 
requests it receives to the current active instance.</li>
+<li><b>BECOMING_ACTIVE</b>: This would be printed if the server is 
transitioning to become an ACTIVE instance. The server cannot service any 
metadata user requests in this state.</li>
+<li><b>BECOMING_PASSIVE</b>: This would be printed if the server is 
transitioning to become a PASSIVE instance. The server cannot service any 
metadata user requests in this state.</li></ul>
+<p>Under normal operating circumstances, only one of these instances should 
print the value <b>ACTIVE</b> as response to the script, and the others would 
print <b>PASSIVE</b>.</p></div>
+<div class="section">
+<h4><a 
name="Configuring_clients_to_use_the_High_Availability_feature"></a>Configuring 
clients to use the High Availability feature</h4>
+<p>The Atlas Web Service can be accessed in two ways:</p>
+<p></p>
+<ul>
+<li><b>Using the Atlas Web UI</b>: This is a browser based client that can be 
used to query the metadata stored in Atlas.</li>
+<li><b>Using the Atlas REST API</b>: As Atlas exposes a RESTful API, one can 
use any standard REST client including libraries in other applications. In 
fact, Atlas ships with a client called AtlasClient that can be used as an 
example to build REST client access.</li></ul>
+<p>In order to take advantage of the High Availability feature in the clients, 
there are two options possible.</p></div>
+<div class="section">
+<h5><a name="Using_an_intermediate_proxy"></a>Using an intermediate proxy</h5>
+<p>The simplest solution to enable highly available access to Atlas is to 
install and configure some intermediate proxy that has a capability to 
transparently switch services based on status. One such proxy solution is <a 
class="externalLink" href="http://www.haproxy.org/";>HAProxy</a>.</p>
+<p>Here is an example HAProxy configuration that can be used. Note this is 
provided for illustration only, and not as a recommended production 
configuration. For that, please refer to the HAProxy documentation for 
appropriate instructions.</p>
+<div class="source">
+<pre>
+frontend atlas_fe
+  bind *:41000
+  default_backend atlas_be
+
+backend atlas_be
+  mode http
+  option httpchk get /api/atlas/admin/status
+  http-check expect string ACTIVE
+  balance roundrobin
+  server host1_21000 host1:21000 check
+  server host2_21000 host2:21000 check backup
+
+listen atlas
+  bind localhost:42000
+
+</pre></div>
+<p>The above configuration binds HAProxy to listen on port 41000 for incoming 
client connections. It then routes the connections to either of the hosts host1 
or host2 depending on a HTTP status check. The status check is done using a 
HTTP GET on the REST URL <tt>/api/atlas/admin/status</tt>, and is deemed 
successful only if the HTTP response contains the string ACTIVE.</p></div>
+<div class="section">
+<h5><a name="Using_automatic_detection_of_active_instance"></a>Using automatic 
detection of active instance</h5>
+<p>If one does not want to set up and manage a separate proxy, then the other 
option to use the High Availability feature is to build a client application 
that is capable of detecting status and retrying operations. In such a setting, 
the client application can be launched with the URLs of all Atlas Web Service 
instances that form the ensemble. The client should then call the REST URL 
<tt>/api/atlas/admin/status</tt> on each of these to determine which is the 
active instance. The response from the Active instance would be of the form 
<tt>{Status:ACTIVE}</tt>. Also, when the client faces any exceptions in the 
course of an operation, it should again determine which of the remaining URLs 
is active and retry the operation.</p>
+<p>The AtlasClient class that ships with Atlas can be used as an example 
client library that implements the logic for working with an ensemble and 
selecting the right Active server instance.</p>
+<p>Utilities in Atlas, like <tt>quick_start.py</tt> and 
<tt>import-hive.sh</tt> can be configured to run with multiple server URLs. 
When launched in this mode, the AtlasClient automatically selects and works 
with the current active instance. If a proxy is set up in between, then its 
address can be used when running quick_start.py or import-hive.sh.</p></div>
+<div class="section">
+<h4><a 
name="Implementation_Details_of_Atlas_High_Availability"></a>Implementation 
Details of Atlas High Availability</h4>
+<p>The Atlas High Availability work is tracked under the master JIRA <a 
class="externalLink" 
href="https://issues.apache.org/jira/browse/ATLAS-510";>ATLAS-510</a>. The JIRAs 
filed under it have detailed information about how the High Availability 
feature has been implemented. At a high level the following points can be 
called out:</p>
+<p></p>
+<ul>
+<li>The automatic selection of an Active instance, as well as automatic 
failover to a new Active instance happen through a leader election 
algorithm.</li>
+<li>For leader election, we use the <a class="externalLink" 
href="http://curator.apache.org/curator-recipes/leader-latch.html";>Leader Latch 
Recipe</a> of <a class="externalLink" href="http://curator.apache.org";>Apache 
Curator</a>.</li>
+<li>The Active instance is the only one which initializes, modifies or reads 
state in the backend stores to keep them consistent.</li>
+<li>Also, when an instance is elected as Active, it refreshes any cached 
information from the backend stores to get up to date.</li>
+<li>A servlet filter ensures that only the active instance services user 
requests. If a passive instance receives these requests, it automatically 
redirects them to the current active instance.</li></ul></div>
 <div class="section">
 <h3><a name="Metadata_Store"></a>Metadata Store</h3>
-<p>As described above, Atlas uses Titan to store the metadata it manages. By 
default, Titan uses BerkeleyDB as an embedded backing store. However, this 
option would result in loss of data if the node running the Atlas server fails. 
In order to provide HA for the metadata store, we recommend that Atlas be 
configured to use HBase as the backing store for Titan. Doing this implies that 
you could benefit from the HA guarantees HBase provides. In order to configure 
Atlas to use HBase in HA mode, do the following:</p>
+<p>As described above, Atlas uses Titan to store the metadata it manages. By 
default, Atlas uses a standalone HBase instance as the backing store for Titan. 
In order to provide HA for the metadata store, we recommend that Atlas be 
configured to use distributed HBase as the backing store for Titan.  Doing this 
implies that you could benefit from the HA guarantees HBase provides. In order 
to configure Atlas to use HBase in HA mode, do the following:</p>
 <p></p>
 <ul>
 <li>Choose an existing HBase cluster that is set up in HA mode to configure in 
Atlas (OR) Set up a new HBase cluster in <a class="externalLink" 
href="http://hbase.apache.org/book.html#quickstart_fully_distributed";>HA 
mode</a>.
@@ -283,7 +372,6 @@
 <h3><a name="Known_Issues"></a>Known Issues</h3>
 <p></p>
 <ul>
-<li><a class="externalLink" 
href="https://issues.apache.org/jira/browse/ATLAS-338";>ATLAS-338</a>: 
ATLAS-338: Metadata events generated from a Hive CLI (as opposed to Beeline or 
any client going <a href="./HiveServer.html">HiveServer</a>2) would be lost if 
Atlas server is down.</li>
 <li>If the HBase region servers hosting the Atlas &#x2018;titan&#x2019; HTable 
are down, Atlas would not be able to store or retrieve metadata from HBase 
until they are brought back online.</li></ul></div>
                   </div>
           </div>

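A minimal sketch, for illustration only, of the 'automatic detection of active
instance' approach described in the HighAvailability.html change above. The
status URL /api/atlas/admin/status and the {Status:ACTIVE} response form come
from the text; the server addresses and the use of the 'requests' library are
assumptions, and this is not the AtlasClient implementation.

    # Hypothetical sketch: find the ACTIVE Atlas instance in an HA ensemble.
    import requests

    SERVERS = ["http://host1.company.com:21000",   # example ensemble members
               "http://host2.company.com:21000"]

    def find_active_instance(servers):
        """Return the base URL of the instance reporting Status ACTIVE."""
        for base in servers:
            try:
                resp = requests.get(base + "/api/atlas/admin/status", timeout=5)
                # The active instance answers with a body like {"Status": "ACTIVE"}.
                if resp.ok and resp.json().get("Status") == "ACTIVE":
                    return base
            except (requests.RequestException, ValueError):
                continue   # unreachable or non-JSON response; try the next one
        raise RuntimeError("no ACTIVE Atlas instance found in the ensemble")

    if __name__ == "__main__":
        print(find_active_instance(SERVERS))

A client built this way would re-run the same check after any failed operation
and retry against the newly active instance, as the page describes.
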
http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/InstallationSteps.html
----------------------------------------------------------------------
diff --git a/InstallationSteps.html b/InstallationSteps.html
index 687708c..45c05e1 100644
--- a/InstallationSteps.html
+++ b/InstallationSteps.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Building & Installing Apache Atlas</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -209,7 +209,7 @@ git clone 
https://git-wip-us.apache.org/repos/asf/incubator-atlas.git atlas
 
 cd atlas
 
-export MAVEN_OPTS=&quot;-Xmx1024m -XX:MaxPermSize=256m&quot; &amp;&amp; mvn 
clean install
+export MAVEN_OPTS=&quot;-Xmx1024m -XX:MaxPermSize=512m&quot; &amp;&amp; mvn 
clean install
 
 </pre></div>
 <p>Once the build successfully completes, artifacts can be packaged for 
deployment.</p>
@@ -233,8 +233,9 @@ mvn clean package -Pdist
    |- cputil.py
 |- conf
    |- atlas-application.properties
-   |- client.properties
    |- atlas-env.sh
+   |- hbase
+      |- hbase-site.xml.template
    |- log4j.xml
    |- solr
       |- currency.xml
@@ -246,6 +247,10 @@ mvn clean package -Pdist
       |- stopwords.txt
       |- synonyms.txt
 |- docs
+|- hbase
+   |- bin
+   |- conf
+   ...
 |- server
    |- webapp
       |- atlas.war
@@ -256,18 +261,21 @@ mvn clean package -Pdist
 |- CHANGES.txt
 
 
-</pre></div></div>
+</pre></div>
+<p>Note that HBase is included in the distribution so that a standalone 
instance of HBase can be started as the default storage backend for the graph 
repository.  During Atlas installation the conf/hbase/hbase-site.xml.template 
gets expanded and moved to hbase/conf/hbase-site.xml for the initial standalone 
HBase configuration.  To configure ATLAS graph persistence for a different 
HBase instance, please see &quot;Graph persistence engine - HBase&quot; in the 
<a href="./Configuration.html">Configuration</a> section.</p></div>
 <div class="section">
-<h4><a name="Installing__Running_Atlas"></a>Installing &amp; Running Atlas</h4>
-<p><b>Installing Atlas</b></p>
+<h4><a name="Installing__Running_Atlas"></a>Installing &amp; Running 
Atlas</h4></div>
+<div class="section">
+<h5><a name="Installing_Atlas"></a>Installing Atlas</h5>
 <div class="source">
 <pre>
 tar -xzvf apache-atlas-${project.version}-bin.tar.gz
 
 cd atlas-${project.version}
 
-</pre></div>
-<p><b>Configuring Atlas</b></p>
+</pre></div></div>
+<div class="section">
+<h5><a name="Configuring_Atlas"></a>Configuring Atlas</h5>
 <p>By default config directory used by Atlas is {package dir}/conf. To 
override this set environment variable ATLAS_CONF to the path of the conf 
dir.</p>
 <p>atlas-env.sh has been added to the Atlas conf. This file can be used to set 
various environment variables that you need for you services. In addition you 
can set any other environment variables you might need. This file will be 
sourced by atlas scripts before any commands are executed. The following 
environment variables are available to set.</p>
 <div class="source">
@@ -306,6 +314,27 @@ cd atlas-${project.version}
 #export ATLAS_EXPANDED_WEBAPP_DIR=
 
 </pre></div>
+<p><b>Settings to support large number of metadata objects</b></p>
+<p>If you plan to store several tens of thousands of metadata objects, it is 
recommended that you use values tuned for better GC performance of the JVM.</p>
+<p>The following values are common server side options:</p>
+<div class="source">
+<pre>
+export ATLAS_SERVER_OPTS=&quot;-server -XX:SoftRefLRUPolicyMSPerMB=0 
-XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution 
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof 
-Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails 
-XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps&quot;
+
+</pre></div>
+<p>The <tt>-XX:SoftRefLRUPolicyMSPerMB</tt> option was found to be 
particularly helpful to regulate GC performance for query heavy workloads with 
many concurrent users.</p>
+<p>The following values are recommended for JDK 7:</p>
+<div class="source">
+<pre>
+export ATLAS_SERVER_HEAP=&quot;-Xms15360m -Xmx15360m -XX:MaxNewSize=3072m 
-XX:PermSize=100M -XX:MaxPermSize=512m&quot;
+
+</pre></div>
+<p>The following values are recommended for JDK 8:</p>
+<div class="source">
+<pre>
+export ATLAS_SERVER_HEAP=&quot;-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m 
-XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m&quot;
+
+</pre></div>
 <p><b>NOTE for Mac OS users</b> If you are using a Mac OS, you will need to 
configure the ATLAS_SERVER_OPTS (explained above).</p>
 <p>In  {package dir}/conf/atlas-env.sh uncomment the following line</p>
 <div class="source">
@@ -323,8 +352,8 @@ export ATLAS_SERVER_OPTS=&quot;-Djava.awt.headless=true 
-Djava.security.krb5.rea
 <p>By default, Atlas uses Titan as the graph repository and is the only graph 
repository implementation available currently. The HBase versions currently 
supported are 1.1.x. For configuring ATLAS graph persistence on HBase, please 
see &quot;Graph persistence engine - HBase&quot; in the <a 
href="./Configuration.html">Configuration</a> section for more details.</p>
 <p>Pre-requisites for running HBase as a distributed cluster</p>
 <ul>
-<li>3 or 5 <a href="./ZooKeeper.html">ZooKeeper</a> nodes</li>
-<li>Atleast 3 <a href="./RegionServer.html">RegionServer</a> nodes. It would 
be ideal to run the <a href="./DataNodes.html">DataNodes</a> on the same hosts 
as the Region servers for data locality.</li></ul>
+<li>3 or 5 ZooKeeper nodes</li>
+<li>At least 3 RegionServer nodes. It would be ideal to run the DataNodes on 
the same hosts as the Region servers for data locality.</li></ul>
 <p><b>Configuring SOLR as the Indexing Backend for the Graph Repository</b></p>
 <p>By default, Atlas uses Titan as the graph repository and is the only graph 
repository implementation available currently. For configuring Titan to work 
with Solr, please follow the instructions below</p>
 <p></p>
@@ -332,7 +361,7 @@ export ATLAS_SERVER_OPTS=&quot;-Djava.awt.headless=true 
-Djava.security.krb5.rea
 <li>Install solr if not already running. The version of SOLR supported is 
5.2.1. Could be installed from <a class="externalLink" 
href="http://archive.apache.org/dist/lucene/solr/5.2.1/solr-5.2.1.tgz";>http://archive.apache.org/dist/lucene/solr/5.2.1/solr-5.2.1.tgz</a></li></ul>
 <p></p>
 <ul>
-<li>Start solr in cloud mode.</li></ul><a 
href="./SolrCloud.html">SolrCloud</a> mode uses a <a 
href="./ZooKeeper.html">ZooKeeper</a> Service as a highly available, central 
location for cluster management.   For a small cluster, running with an 
existing <a href="./ZooKeeper.html">ZooKeeper</a> quorum should be fine. For 
larger clusters, you would want to run separate multiple <a 
href="./ZooKeeper.html">ZooKeeper</a> quorum with atleast 3 servers.   Note: 
Atlas currently supports solr in &quot;cloud&quot; mode only. &quot;http&quot; 
mode is not supported. For more information, refer solr documentation - <a 
class="externalLink" 
href="https://cwiki.apache.org/confluence/display/solr/SolrCloud";>https://cwiki.apache.org/confluence/display/solr/SolrCloud</a>
+<li>Start solr in cloud mode.</li></ul>SolrCloud mode uses a ZooKeeper Service 
as a highly available, central location for cluster management.   For a small 
cluster, running with an existing ZooKeeper quorum should be fine. For larger 
clusters, you would want to run a separate ZooKeeper quorum with at least 
3 servers.   Note: Atlas currently supports solr in &quot;cloud&quot; mode 
only. &quot;http&quot; mode is not supported. For more information, refer solr 
documentation - <a class="externalLink" 
href="https://cwiki.apache.org/confluence/display/solr/SolrCloud";>https://cwiki.apache.org/confluence/display/solr/SolrCloud</a>
 <p></p>
 <ul>
 <li>For e.g., to bring up a Solr node listening on port 8983 on a machine, you 
can use the command:</li></ul>
@@ -351,7 +380,7 @@ export ATLAS_SERVER_OPTS=&quot;-Djava.awt.headless=true 
-Djava.security.krb5.rea
   bin/solr create -c fulltext_index -d SOLR_CONF -shards #numShards 
-replicationFactor #replicationFactor
 
 </pre></div>
-<p>Note: If numShards and replicationFactor are not specified, they default to 
1 which suffices if you are trying out solr with ATLAS on a single node 
instance.   Otherwise specify numShards according to the number of hosts that 
are in the Solr cluster and the maxShardsPerNode configuration.   The number of 
shards cannot exceed the total number of Solr nodes in your SolrCloud 
cluster.</p>
+<p>Note: If numShards and replicationFactor are not specified, they default to 
1 which suffices if you are trying out solr with ATLAS on a single node 
instance.   Otherwise specify numShards according to the number of hosts that 
are in the Solr cluster and the maxShardsPerNode configuration.   The number of 
shards cannot exceed the total number of Solr nodes in your SolrCloud 
cluster.</p>
 <p>The number of replicas (replicationFactor) can be set according to the 
redundancy required.</p>
 <p></p>
 <ul>
@@ -367,8 +396,15 @@ export ATLAS_SERVER_OPTS=&quot;-Djava.awt.headless=true 
-Djava.security.krb5.rea
 <ul>
 <li>Restart Atlas</li></ul>
 <p>For more information on Titan solr configuration , please refer <a 
class="externalLink" 
href="http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.htm";>http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.htm</a></p>
-<p>Pre-requisites for running Solr in cloud mode   * Memory - Solr is both 
memory and CPU intensive. Make sure the server running Solr has adequate 
memory, CPU and disk.     Solr works well with 32GB RAM. Plan to provide as 
much memory as possible to Solr process   * Disk - If the number of entities 
that need to be stored are large, plan to have at least 500 GB free space in 
the volume where Solr is going to store the index data   * <a 
href="./SolrCloud.html">SolrCloud</a> has support for replication and sharding. 
It is highly recommended to use <a href="./SolrCloud.html">SolrCloud</a> with 
at least two Solr nodes running on different servers with replication enabled.  
   If using <a href="./SolrCloud.html">SolrCloud</a>, then you also need <a 
href="./ZooKeeper.html">ZooKeeper</a> installed and configured with 3 or 5 <a 
href="./ZooKeeper.html">ZooKeeper</a> nodes</p>
-<p><b>Starting Atlas Server</b></p>
+<p>Pre-requisites for running Solr in cloud mode   * Memory - Solr is both 
memory and CPU intensive. Make sure the server running Solr has adequate 
memory, CPU and disk.     Solr works well with 32GB RAM. Plan to provide as 
much memory as possible to Solr process   * Disk - If the number of entities 
that need to be stored are large, plan to have at least 500 GB free space in 
the volume where Solr is going to store the index data   * SolrCloud has 
support for replication and sharding. It is highly recommended to use SolrCloud 
with at least two Solr nodes running on different servers with replication 
enabled.     If using SolrCloud, then you also need ZooKeeper installed and 
configured with 3 or 5 ZooKeeper nodes</p></div>
+<div class="section">
+<h5><a name="Setting_up_Atlas"></a>Setting up Atlas</h5>
+<p>There are a few steps that set up dependencies of Atlas. One such example is 
setting up the Titan schema in the storage backend of choice. In a simple 
single server setup, these are automatically set up with default configuration 
when the server first accesses these dependencies.</p>
+<p>However, there are scenarios when we may want to run setup steps explicitly 
as one time operations. For example, in a multiple server scenario using <a 
href="./HighAvailability.html">High Availability</a>, it is preferable to run 
setup steps from one of the server instances the first time, and then start the 
services.</p>
+<p>To run these steps one time, execute the command <tt>bin/atlas_start.py 
-setup</tt> from a single Atlas server instance.</p>
+<p>The Atlas server does, however, take care of parallel executions of the 
setup steps, and running the setup steps multiple times is idempotent. 
Therefore, if one chooses to run the setup steps as part of server startup for 
convenience, enable the configuration option 
<tt>atlas.server.run.setup.on.start</tt> by defining it with the value 
<tt>true</tt> in the <tt>atlas-application.properties</tt> file.</p></div>
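+<p>A minimal sketch of the two options described above:</p>
+<div class="source">
+<pre>
+# one-time, explicit setup run from a single Atlas server instance
+bin/atlas_start.py -setup
+
+# or, to run setup on every server start, add to atlas-application.properties:
+atlas.server.run.setup.on.start=true
+
+</pre></div>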
+<div class="section">
+<h5><a name="Starting_Atlas_Server"></a>Starting Atlas Server</h5>
 <div class="source">
 <pre>
 bin/atlas_start.py [-port &lt;port&gt;]
@@ -377,8 +413,10 @@ bin/atlas_start.py [-port &lt;port&gt;]
 <p>By default,</p>
 <ul>
 <li>To change the port, use -port option.</li>
-<li>atlas server starts with conf from {package dir}/conf. To override this 
(to use the same conf with multiple atlas upgrades), set environment variable 
ATLAS_CONF to the path of conf dir</li></ul>
-<p><b>Using Atlas</b></p>
+<li>The Atlas server starts with configuration from {package dir}/conf. To override 
this (to use the same configuration across multiple Atlas upgrades), set the 
environment variable ATLAS_CONF to the path of the conf dir.</li></ul></div>
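+<p>For example (the configuration path shown is an illustrative assumption; 21000 is 
the default Atlas port):</p>
+<div class="source">
+<pre>
+export ATLAS_CONF=/etc/atlas/conf
+bin/atlas_start.py -port 21000
+
+</pre></div>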
+<div class="section">
+<h4><a name="Using_Atlas"></a>Using Atlas</h4>
+<p></p>
 <ul>
 <li>Quick start model - sample model and data</li></ul>
 <div class="source">
@@ -424,13 +462,20 @@ bin/atlas_start.py [-port &lt;port&gt;]
 
 </pre></div>
 <p><b>Dashboard</b></p>
-<p>Once atlas is started, you can view the status of atlas entities using the 
Web-based dashboard. You can open your browser at the corresponding port to use 
the web UI.</p>
-<p><b>Stopping Atlas Server</b></p>
+<p>Once Atlas is started, you can view the status of Atlas entities using the 
web-based dashboard. Open your browser at the corresponding port to use 
the web UI.</p></div>
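+<p>For example, with a local install on the default port, the dashboard would be 
reachable at a URL such as:</p>
+<div class="source">
+<pre>
+http://localhost:21000/
+
+</pre></div>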
+<div class="section">
+<h4><a name="Stopping_Atlas_Server"></a>Stopping Atlas Server</h4>
 <div class="source">
 <pre>
 bin/atlas_stop.py
 
 </pre></div></div>
+<div class="section">
+<h4><a name="Known_Issues"></a>Known Issues</h4></div>
+<div class="section">
+<h5><a name="Setup"></a>Setup</h5>
+<p>If the setup of the Atlas service fails for any reason, the next run of 
setup (either by an explicit invocation of <tt>atlas_start.py -setup</tt> or by 
enabling the configuration option <tt>atlas.server.run.setup.on.start</tt>) 
will fail with a message such as <tt>A previous setup run may not have 
completed cleanly.</tt> In such cases, you would need to manually ensure that 
setup can run, and delete the ZooKeeper node at 
<tt>/apache_atlas/setup_in_progress</tt> before attempting to run setup 
again.</p>
+<p>If the setup failed due to HBase Titan schema setup errors, it may be 
necessary to repair the HBase schema. If no data has been stored, one can also 
disable and drop the 'titan' table in HBase to let setup run again.</p></div>
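+<p>A sketch of the recovery steps described above, assuming the ZooKeeper and HBase 
command-line clients are available on the host (the ZooKeeper connect string is a 
placeholder):</p>
+<div class="source">
+<pre>
+# remove the marker node left behind by the failed setup run
+zkCli.sh -server &lt;zookeeper_host:port&gt; delete /apache_atlas/setup_in_progress
+
+# only if no data has been stored: disable and drop the 'titan' table so setup can recreate it
+echo &quot;disable 'titan'&quot; | hbase shell
+echo &quot;drop 'titan'&quot;    | hbase shell
+
+</pre></div>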
                   </div>
           </div>
 

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Notification-Entity.html
----------------------------------------------------------------------
diff --git a/Notification-Entity.html b/Notification-Entity.html
index 184fd90..0471908 100644
--- a/Notification-Entity.html
+++ b/Notification-Entity.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Entity Change Notifications</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/QuickStart.html
----------------------------------------------------------------------
diff --git a/QuickStart.html b/QuickStart.html
index 2bb35b0..8f32fe4 100644
--- a/QuickStart.html
+++ b/QuickStart.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Quick Start Guide</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Repository.html
----------------------------------------------------------------------
diff --git a/Repository.html b/Repository.html
index ef67868..ed0f43d 100644
--- a/Repository.html
+++ b/Repository.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Repository</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Search.html
----------------------------------------------------------------------
diff --git a/Search.html b/Search.html
index 8f89001..ac79c42 100644
--- a/Search.html
+++ b/Search.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Search</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -264,14 +264,14 @@ literal: booleanConstant |
 <p>Grammar language: opt(a) =&gt; a is optional; ~ =&gt; a combinator, 'a ~ b' means 
a followed by b; rep =&gt; zero or more; rep1sep =&gt; one or more, separated by 
second arg.</p>
 <p>Language Notes:</p>
 <ul>
-<li>A <b><a href="./SingleQuery.html">SingleQuery</a></b> expression can be 
used to search for entities of a <i>Trait</i> or 
<i>Class</i>.</li></ul>Entities can be filtered based on a 'Where Clause' and 
Entity Attributes can be retrieved based on a 'Select Clause'.
+<li>A <b>SingleQuery</b> expression can be used to search for entities of a 
<i>Trait</i> or <i>Class</i>.</li></ul>Entities can be filtered based on a 
'Where Clause' and Entity Attributes can be retrieved based on a 'Select 
Clause'.
 <ul>
-<li>An Entity Graph can be traversed/joined by combining one or more <a 
href="./SingleQueries.html">SingleQueries</a>.</li>
+<li>An Entity Graph can be traversed/joined by combining one or more 
SingleQueries.</li>
 <li>An attempt is made to make the expressions look SQL like by accepting 
keywords &quot;SELECT&quot;,</li></ul>&quot;FROM&quot;, and &quot;WHERE&quot;; 
but these are optional and users can simply think in terms of Entity Graph 
Traversals.
 <ul>
 <li>The transitive closure of an Entity relationship can be expressed via the 
<i>Loop</i> expression. A</li></ul><i>Loop</i> expression can be any traversal 
(recursively a query) that represents a <i>Path</i> that ends in an Entity of 
the same <i>Type</i> as the starting Entity.
 <ul>
-<li>The <i><a href="./WithPath.html">WithPath</a></i> clause can be used with 
transitive closure queries to retrieve the Path that</li></ul>connects the two 
related Entities. (We also provide a higher level interface for Closure Queries 
  see scaladoc for 'org.apache.atlas.query.ClosureQuery')
+<li>The <i>WithPath</i> clause can be used with transitive closure queries to 
retrieve the Path that</li></ul>connects the two related Entities. (We also 
provide a higher level interface for Closure Queries; see scaladoc for 
'org.apache.atlas.query.ClosureQuery')
 <ul>
 <li>There are a couple of Predicate functions different from SQL:
 <ul>
@@ -285,7 +285,7 @@ literal: booleanConstant |
 <li>from DB</li>
 <li>DB where name=&quot;Reporting&quot; select name, owner</li>
 <li>DB has name</li>
-<li>DB is <a href="./JdbcAccess.html">JdbcAccess</a></li>
+<li>DB is JdbcAccess</li>
 <li>Column where Column isa PII</li>
 <li>Table where name=&quot;sales_fact&quot;, columns</li>
 <li>Table where name=&quot;sales_fact&quot;, columns as column select 
column.name, column.dataType, column.comment</li>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/Security.html
----------------------------------------------------------------------
diff --git a/Security.html b/Security.html
index 8af726c..9649d81 100644
--- a/Security.html
+++ b/Security.html
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2016-01-05
+ | Generated by Apache Maven Doxia at 2016-04-25
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160105" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache Atlas &#x2013; Security Features of Apache Atlas</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -189,7 +189,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 
2016-01-05</li> <li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
             
                             </ul>
@@ -217,7 +217,8 @@
 <li><code>keystore.file</code> - the path to the keystore file leveraged by 
the server.  This file contains the server certificate.</li>
 <li><code>truststore.file</code> - the path to the truststore file. This file 
contains the certificates of other trusted entities (e.g. the certificates for 
client processes if two-way SSL is enabled).  In most instances this can be set 
to the same value as the keystore.file property (especially if one-way SSL is 
enabled).</li>
 <li><code>client.auth.enabled</code> (false|true) [default: false] - 
enable/disable client authentication.  If enabled, the client will have to 
authenticate to the server during the transport session key creation process 
(i.e. two-way SSL is in effect).</li>
-<li><code>cert.stores.credential.provider.path</code> - the path to the 
Credential Provider store file.  The passwords for the keystore, truststore, 
and server certificate are maintained in this secure file.  Utilize the cputil 
script in the 'bin' directoy (see below) to populate this file with the 
passwords required.</li></ul></div>
+<li><code>cert.stores.credential.provider.path</code> - the path to the 
Credential Provider store file.  The passwords for the keystore, truststore, 
and server certificate are maintained in this secure file.  Utilize the cputil 
script in the 'bin' directory (see below) to populate this file with the 
passwords required.</li>
+<li><code>atlas.ssl.exclude.cipher.suites</code> - the list of excluded Cipher 
Suites. The weak and unsafe Cipher Suites .*NULL.*, .*RC4.*, .*MD5.*, .*DES.*, 
.*DSS.* are excluded by default. If additional Ciphers need to be excluded, set 
this property to the default list, e.g. 
atlas.ssl.exclude.cipher.suites=.*NULL.*, .*RC4.*, .*MD5.*, .*DES.*, .*DSS.*, 
and add the additional Cipher Suites to the list with a comma separator. They 
can be added with their full name or as a regular expression. The Cipher Suites 
listed in the atlas.ssl.exclude.cipher.suites property take precedence over the 
default Cipher Suites. It is recommended to keep the default Cipher Suites and 
add additional ones as needed.</li></ul></div>
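+<p>For instance (the added pattern .*TLS_RSA.* is illustrative only), excluding an 
additional suite on top of the defaults would look like this in 
atlas-application.properties:</p>
+<div class="source">
+<pre>
+atlas.ssl.exclude.cipher.suites=.*NULL.*, .*RC4.*, .*MD5.*, .*DES.*, .*DSS.*, .*TLS_RSA.*
+
+</pre></div>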
 <div class="section">
 <h5><a name="Credential_Provider_Utility_Script"></a>Credential Provider 
Utility Script</h5>
 <p>In order to prevent the use of clear-text passwords, the Atlas platform 
makes use of the Credential Provider facility for secure password storage (see 
<a class="externalLink" 
href="http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#credential";>Hadoop
 Credential Command Reference</a> for more information about this facility).  
The cputil script in the 'bin' directory can be leveraged to create the 
password store required.</p>
@@ -284,7 +285,7 @@
 <p>For a more detailed discussion of the HTTP authentication mechanism refer 
to <a class="externalLink" 
href="http://hadoop.apache.org/docs/stable/hadoop-auth/Configuration.html";>Hadoop
 Auth, Java HTTP SPNEGO 2.6.0 - Server Side Configuration</a>.  The prefix that 
document references is &quot;atlas.http.authentication&quot; in the case of the 
Atlas authentication implementation.</p></div>
 <div class="section">
 <h4><a name="Client_security_configuration"></a>Client security 
configuration</h4>
-<p>When leveraging Atlas client code to communicate with an Atlas server 
configured for SSL transport and/or Kerberos authentication, there is a 
requirement to provide a client configuration file that provides the security 
properties that allow for communication with, or authenticating to, the server. 
Create a client.properties file with the appropriate settings (see below) and 
place it on the client's classpath or in the directory specified by the 
&quot;atlas.conf&quot; system property.</p>
+<p>When leveraging Atlas client code to communicate with an Atlas server 
configured for SSL transport and/or Kerberos authentication, the client must be 
provided with the Atlas client configuration file containing the security 
properties that allow for communication with, or authentication to, the server. 
Update the atlas-application.properties file with the appropriate settings 
(see below) and copy it to the client's classpath or to the directory 
specified by the &quot;atlas.conf&quot; system property.</p>
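+<p>For example (the path and main class are placeholders), a client JVM can be 
pointed at the directory holding this file via the system property:</p>
+<div class="source">
+<pre>
+java -Datlas.conf=/etc/atlas/conf -cp &lt;client classpath&gt; &lt;client main class&gt;
+
+</pre></div>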
 <p>The client properties for SSL communication are:</p>
 <p></p>
 <ul>

http://git-wip-us.apache.org/repos/asf/incubator-atlas-website/blob/a876d178/StormAtlasHook.html
----------------------------------------------------------------------
diff --git a/StormAtlasHook.html b/StormAtlasHook.html
new file mode 100644
index 0000000..b6c3099
--- /dev/null
+++ b/StormAtlasHook.html
@@ -0,0 +1,298 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2016-04-25
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20160425" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Apache Atlas &#x2013; Storm Atlas Bridge</title>
+    <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
+    <link rel="stylesheet" href="./css/site.css" />
+    <link rel="stylesheet" href="./css/print.css" media="print" />
+
+      
+    <script type="text/javascript" 
src="./js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<script type="text/javascript">$( document ).ready( function() { $( 
'.carousel' ).carousel( { interval: 3500 } ) } );</script>
+          
+            </head>
+        <body class="topBarEnabled">
+          
+                        
+                    
+                
+
+    <div id="topbar" class="navbar navbar-fixed-top ">
+      <div class="navbar-inner">
+                                  <div class="container" style="width: 
68%;"><div class="nav-collapse">
+            
+                
+                                <ul class="nav">
+                          <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Atlas <b 
class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="index.html"  title="About">About</a>
+</li>
+                  
+                      <li>      <a 
href="https://cwiki.apache.org/confluence/display/ATLAS";  title="Wiki">Wiki</a>
+</li>
+                  
+                      <li>      <a 
href="https://cwiki.apache.org/confluence/display/ATLAS";  title="News">News</a>
+</li>
+                  
+                      <li>      <a 
href="https://git-wip-us.apache.org/repos/asf/incubator-atlas.git";  
title="Git">Git</a>
+</li>
+                  
+                      <li>      <a 
href="https://issues.apache.org/jira/browse/ATLAS";  title="Jira">Jira</a>
+</li>
+                  
+                      <li>      <a 
href="https://cwiki.apache.org/confluence/display/ATLAS/PoweredBy";  
title="Powered by">Powered by</a>
+</li>
+                  
+                      <li>      <a href="http://blogs.apache.org/atlas/";  
title="Blog">Blog</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Project 
Information <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="project-info.html"  
title="Summary">Summary</a>
+</li>
+                  
+                      <li>      <a href="mail-lists.html"  title="Mailing 
Lists">Mailing Lists</a>
+</li>
+                  
+                      <li>      <a 
href="http://webchat.freenode.net?channels=apacheatlas&uio=d4";  
title="IRC">IRC</a>
+</li>
+                  
+                      <li>      <a href="team-list.html"  title="Team">Team</a>
+</li>
+                  
+                      <li>      <a href="issue-tracking.html"  title="Issue 
Tracking">Issue Tracking</a>
+</li>
+                  
+                      <li>      <a href="source-repository.html"  
title="Source Repository">Source Repository</a>
+</li>
+                  
+                      <li>      <a href="license.html"  
title="License">License</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Releases <b 
class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a 
href="http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.6.0-incubating/";  
title="0.6-incubating">0.6-incubating</a>
+</li>
+                  
+                      <li>      <a 
href="http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.5.0-incubating/";  
title="0.5-incubating">0.5-incubating</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" 
data-toggle="dropdown">Documentation <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="0.6.0-incubating/index.html"  
title="0.6-incubating">0.6-incubating</a>
+</li>
+                  
+                      <li>      <a href="0.5.0-incubating/index.html"  
title="0.5-incubating">0.5-incubating</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b 
class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a 
href="http://www.apache.org/foundation/how-it-works.html";  title="How Apache 
Works">How Apache Works</a>
+</li>
+                  
+                      <li>      <a href="http://www.apache.org/foundation/";  
title="Foundation">Foundation</a>
+</li>
+                  
+                      <li>      <a 
href="http://www.apache.org/foundation/sponsorship.html";  title="Sponsoring 
Apache">Sponsoring Apache</a>
+</li>
+                  
+                      <li>      <a 
href="http://www.apache.org/foundation/thanks.html";  title="Thanks">Thanks</a>
+</li>
+                          </ul>
+      </li>
+                  </ul>
+          
+                      <form id="search-form" 
action="http://www.google.com/search"; method="get"  class="navbar-search 
pull-right" >
+    
+  <input value="http://atlas.incubator.apache.org"; name="sitesearch" 
type="hidden"/>
+  <input class="search-query" name="q" id="query" type="text" />
+</form>
+<script type="text/javascript" 
src="http://www.google.com/coop/cse/brand?form=search-form";></script>
+          
+                            
+            
+            
+            
+    <iframe 
src="http://www.facebook.com/plugins/like.php?href=http://atlas.incubator.apache.org/atlas-docs&send=false&layout=button_count&show-faces=false&action=like&colorscheme=dark";
+        scrolling="no" frameborder="0"
+        style="border:none; width:80px; height:20px; margin-top: 10px;"  
class="pull-right" ></iframe>
+                        
+    <script type="text/javascript" 
src="https://apis.google.com/js/plusone.js";></script>
+
+        <ul class="nav pull-right"><li style="margin-top: 10px;">
+    
+    <div class="g-plusone" 
data-href="http://atlas.incubator.apache.org/atlas-docs"; data-size="medium"  
width="60px" align="right" ></div>
+
+        </li></ul>
+                              
+                   
+                      </div>
+          
+        </div>
+      </div>
+    </div>
+    
+        <div class="container">
+          <div id="banner">
+        <div class="pull-left">
+                                                  <a href=".." id="bannerLeft">
+                                                                               
                 <img src="images/atlas-logo.png"  alt="Apache Atlas" 
width="200px" height="45px"/>
+                </a>
+                      </div>
+        <div class="pull-right">                  <a 
href="http://incubator.apache.org"; id="bannerRight">
+                                                                               
                 <img src="images/apache-incubator-logo.png"  alt="Apache 
Incubator"/>
+                </a>
+      </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org"; class="externalLink" 
title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="index.html" title="Atlas">
+        Atlas</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class="">Storm Atlas Bridge</li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 
2016-04-25</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 
0.7-incubating-SNAPSHOT</li>
+            
+                            </ul>
+      </div>
+
+      
+                        
+        <div id="bodyColumn" >
+                                  
+            <div class="section">
+<h2><a name="Storm_Atlas_Bridge"></a>Storm Atlas Bridge</h2></div>
+<div class="section">
+<h3><a name="Introduction"></a>Introduction</h3>
+<p>Apache Storm is a distributed real-time computation system. Storm makes it 
easy to reliably process unbounded streams of data, doing for real-time 
processing what Hadoop did for batch processing. The processing is essentially a 
DAG of nodes, which is called a <b>topology</b>.</p>
+<p>Apache Atlas is a metadata repository that enables end-to-end data lineage, 
search, and association of business classifications.</p>
+<p>The goal of this integration is to push the operational topology metadata 
along with the underlying data source(s), target(s), derivation processes and 
any available business context so Atlas can capture the lineage for this 
topology.</p>
+<p>There are two parts to this process, detailed below:</p>
+<ul>
+<li>Data model to represent the concepts in Storm</li>
+<li>Storm Atlas Hook to update metadata in Atlas</li></ul></div>
+<div class="section">
+<h3><a name="Storm_Data_Model"></a>Storm Data Model</h3>
+<p>A data model is represented as Types in Atlas. It contains the descriptions 
of various nodes in the topology graph, such as spouts and bolts and the 
corresponding producer and consumer types.</p>
+<p>The following types are added in Atlas.</p>
+<p></p>
+<ul>
+<li>storm_topology - represents the coarse-grained topology. A storm_topology 
derives from an Atlas Process type and hence can be used to inform Atlas about 
lineage.</li>
+<li>The following data sets are added - kafka_topic, jms_topic, hbase_table, 
hdfs_data_set. These all derive from an Atlas Dataset type and hence form the 
end points of a lineage graph.</li>
+<li>storm_spout - Data Producer having outputs, typically Kafka, JMS</li>
+<li>storm_bolt - Data Consumer having inputs and outputs, typically Hive, 
HBase, HDFS, etc.</li></ul>
+<p>The Storm Atlas hook auto registers dependent models like the Hive data 
model if it finds that these are not known to the Atlas server.</p>
+<p>The data model for each of the types is described in the class definition 
at org.apache.atlas.storm.model.StormDataModel.</p></div>
+<div class="section">
+<h3><a name="Storm_Atlas_Hook"></a>Storm Atlas Hook</h3>
+<p>Atlas is notified when a new topology is registered successfully in Storm. 
Storm provides a hook, backtype.storm.ISubmitterHook, at the Storm client used 
to submit a storm topology.</p>
+<p>The Storm Atlas hook intercepts the hook post execution and extracts the 
metadata from the topology and updates Atlas using the types defined. Atlas 
implements the Storm client hook interface in 
org.apache.atlas.storm.hook.StormAtlasHook.</p></div>
+<div class="section">
+<h3><a name="Limitations"></a>Limitations</h3>
+<p>The following apply for the first version of the integration.</p>
+<p></p>
+<ul>
+<li>Only new topology submissions are registered with Atlas; lifecycle 
changes are not reflected in Atlas.</li>
+<li>The Atlas server needs to be online when a Storm topology is submitted for 
the metadata to be captured.</li>
+<li>The Hook currently does not support capturing lineage for custom spouts 
and bolts.</li></ul></div>
+<div class="section">
+<h3><a name="Installation"></a>Installation</h3>
+<p>The Storm Atlas Hook needs to be manually installed in Storm on the client 
side. The hook artifacts are available at: $ATLAS_PACKAGE/hook/storm</p>
+<p>Storm Atlas hook jars need to be copied to $STORM_HOME/extlib. Replace 
STORM_HOME with the Storm installation path.</p>
+<p>Restart all daemons after you have installed the Atlas hook into 
Storm.</p></div>
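+<p>A sketch of the copy step (assuming the hook directory contains the jars directly; 
adjust for the actual package layout):</p>
+<div class="source">
+<pre>
+cp $ATLAS_PACKAGE/hook/storm/*.jar $STORM_HOME/extlib/
+
+</pre></div>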
+<div class="section">
+<h3><a name="Configuration"></a>Configuration</h3></div>
+<div class="section">
+<h4><a name="Storm_Configuration"></a>Storm Configuration</h4>
+<p>The Storm Atlas Hook needs to be configured in Storm client config in 
<b>$STORM_HOME/conf/storm.yaml</b> as:</p>
+<div class="source">
+<pre>
+storm.topology.submission.notifier.plugin.class: 
&quot;org.apache.atlas.storm.hook.StormAtlasHook&quot;
+
+</pre></div>
+<p>Also set a 'cluster name' that would be used as a namespace for objects 
registered in Atlas. This name would be used for namespacing the Storm 
topology, spouts and bolts.</p>
+<p>The other objects like data sets should ideally be identified with the 
cluster name of the components that generate them. For example, Hive tables and 
databases should be identified using the cluster name set in Hive. The Storm 
Atlas hook will pick this up if the Hive configuration is available in the 
Storm topology jar that is submitted on the client and the cluster name is 
defined there. This happens similarly for HBase data sets. If this 
configuration is not available, the cluster name set in the Storm configuration 
will be used.</p>
+<div class="source">
+<pre>
+atlas.cluster.name: &quot;cluster_name&quot;
+
+</pre></div>
+<p>In <b>$STORM_HOME/conf/storm_env.ini</b>, set an environment variable as 
follows:</p>
+<div class="source">
+<pre>
+STORM_JAR_JVM_OPTS:&quot;-Datlas.conf=$ATLAS_HOME/conf/&quot;
+
+</pre></div>
+<p>where ATLAS_HOME points to the directory where Atlas is installed.</p>
+<p>You could also set this up programmatically in the Storm Config as:</p>
+<div class="source">
+<pre>
+    Config stormConf = new Config();
+    ...
+    stormConf.put(Config.STORM_TOPOLOGY_SUBMISSION_NOTIFIER_PLUGIN,
+            org.apache.atlas.storm.hook.StormAtlasHook.class.getName());
+
+</pre></div></div>
+                  </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container">
+              <div class="row span12">Copyright &copy;                    
2015-2016
+                        <a href="http://www.apache.org";>Apache Software 
Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+                          
+                <p id="poweredBy" class="pull-right">
+                          <a href="http://maven.apache.org/"; title="Built by 
Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" 
src="./images/logos/maven-feather.png" />
+      </a>
+              </p>
+        
+                </div>
+    </footer>
+  </body>
+</html>

