Repository: falcon
Updated Branches:
  refs/heads/master fc34d42cb -> 85345ad7e


FALCON-1106 Documentation for extensions

Author: Sowmya Ramesh <[email protected]>

Reviewers: "Balu Vellanki <[email protected]>", Ying Zheng 
<[email protected]>"

Closes #120 from sowmyaramesh/FALCON-1106


Project: http://git-wip-us.apache.org/repos/asf/falcon/repo
Commit: http://git-wip-us.apache.org/repos/asf/falcon/commit/85345ad7
Tree: http://git-wip-us.apache.org/repos/asf/falcon/tree/85345ad7
Diff: http://git-wip-us.apache.org/repos/asf/falcon/diff/85345ad7

Branch: refs/heads/master
Commit: 85345ad7e7421fbd25829381f27eb5b165d2f8d0
Parents: fc34d42
Author: Sowmya Ramesh <[email protected]>
Authored: Tue May 3 14:45:40 2016 -0700
Committer: Sowmya Ramesh <[email protected]>
Committed: Tue May 3 14:45:40 2016 -0700

----------------------------------------------------------------------
 addons/extensions/hdfs-mirroring/README         |  11 +-
 addons/extensions/hive-mirroring/README         |  43 +----
 addons/hivedr/README                            |  16 +-
 docs/src/site/twiki/EntitySpecification.twiki   |   8 +-
 docs/src/site/twiki/Extensions.twiki            |  55 +++++++
 docs/src/site/twiki/FalconDocumentation.twiki   |   6 +-
 docs/src/site/twiki/HDFSDR.twiki                |  34 ----
 docs/src/site/twiki/HDFSMirroring.twiki         |  27 ++++
 docs/src/site/twiki/HiveDR.twiki                |  80 ----------
 docs/src/site/twiki/HiveMirroring.twiki         |  63 ++++++++
 docs/src/site/twiki/Recipes.twiki               |  85 ----------
 .../site/twiki/falconcli/DefineExtension.twiki  |   8 +
 .../twiki/falconcli/DescribeExtension.twiki     |   8 +
 .../twiki/falconcli/EnumerateExtension.twiki    |   8 +
 docs/src/site/twiki/falconcli/FalconCLI.twiki   |  16 +-
 .../twiki/restapi/ExtensionDefinition.twiki     | 160 +++++++++++++++++++
 .../twiki/restapi/ExtensionDescription.twiki    |  24 +++
 .../twiki/restapi/ExtensionEnumeration.twiki    |  38 +++++
 docs/src/site/twiki/restapi/ResourceList.twiki  |  14 +-
 19 files changed, 442 insertions(+), 262 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/extensions/hdfs-mirroring/README
----------------------------------------------------------------------
diff --git a/addons/extensions/hdfs-mirroring/README 
b/addons/extensions/hdfs-mirroring/README
index 78f1726..24c2bd4 100644
--- a/addons/extensions/hdfs-mirroring/README
+++ b/addons/extensions/hdfs-mirroring/README
@@ -14,16 +14,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-HDFS Directory Replication Extension
+HDFS Mirroring Extension
 
 Overview
-This extension implements replicating arbitrary directories on HDFS from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
+Falcon supports an HDFS mirroring extension to replicate data from a source cluster to a destination cluster.
+This extension replicates arbitrary directories on HDFS and piggybacks on the replication solution in Falcon, which uses the DistCp tool.
+It also allows users to replicate data from on-premise clusters to cloud storage, either Azure WASB or S3.
+
 
 Use Case
 * Copy directories between HDFS clusters, without dated partitions
 * Archive directories from HDFS to Cloud. Ex: S3, Azure WASB
 
 Limitations
-As the data volume and number of files grow, this can get inefficient.
+* As the data volume and number of files grow, this can get inefficient.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/extensions/hive-mirroring/README
----------------------------------------------------------------------
diff --git a/addons/extensions/hive-mirroring/README 
b/addons/extensions/hive-mirroring/README
index 827f7e5..04637c0 100644
--- a/addons/extensions/hive-mirroring/README
+++ b/addons/extensions/hive-mirroring/README
@@ -14,45 +14,18 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-Hive Metastore Disaster Recovery Recipe
+Hive Mirroring Extension
 
 Overview
-This extension implements replicating hive metadata and data from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster.
+This is supported for both secure and unsecure clusters through Falcon extensions. Falcon uses the event-based replication capability provided by Hive to implement the Hive mirroring feature.
+Falcon acts as the admin/user-facing tool with fine-grained control over what and how to replicate, as defined by its users, while leaving the delta, data and metadata management to Hive itself.
+The Hive mirroring extension piggybacks on the DistCp tool for replication.
 
 Use Case
-*
-*
+* Replicate data/metadata of Hive DB & table from source to target cluster
 
 Limitations
-*
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-Hive Metastore Disaster Recovery Extension
+* Currently Hive doesn't support replication events for create database, roles, views, offline tables, direct HDFS writes without registering with the metastore, and database/table name mapping.
+Hence the Hive mirroring extension cannot be used to replicate the above-mentioned events between warehouses.
 
-Overview
-This extension implements replicating hive metadata and data from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
-
-Use Case
-*
-*
-
-Limitations
-*

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/hivedr/README
----------------------------------------------------------------------
diff --git a/addons/hivedr/README b/addons/hivedr/README
index 0b448d3..161ed1b 100644
--- a/addons/hivedr/README
+++ b/addons/hivedr/README
@@ -14,20 +14,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-Hive Disaster Recovery
+Hive Mirroring
 =======================
 
 Overview
 ---------
 
-Falcon provides feature to replicate Hive metadata and data events from one 
hadoop cluster
-to another cluster. This is supported for secure and unsecure cluster through 
Falcon Recipes.
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster. This is supported for both secure and unsecure clusters through Falcon extensions.
 
 
 Prerequisites
 -------------
 
-Following is the prerequisites to use Hive DR
+Following are the prerequisites to use Hive mirroring:
 
 * Hive 1.2.0+
 * Oozie 4.2.0+
@@ -69,12 +68,9 @@ a. Perform initial bootstrap of Table and Database from one 
Hadoop cluster to an
 b. Setup cluster definition
   $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
 
-c. Submit Hive DR recipe
-   $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation 
HIVE_DISASTER_RECOVERY
+c. Submit Hive mirroring extension
+   $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hive-mirroring -file /process/definition.xml
 
+   Please refer to the Falcon CLI and REST API twiki pages in the Falcon documentation for more details on using the CLI and REST APIs for extension job and instance management.
 
-Recipe templates for Hive DR is available in 
addons/recipe/hive-disaster-recovery and copy it to
-recipe path specified in client.properties.
 
-*Note:* If kerberos security is enabled on cluster, use the secure templates 
for Hive DR from
-        addons/recipe/hive-disaster-recovery
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/EntitySpecification.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/EntitySpecification.twiki 
b/docs/src/site/twiki/EntitySpecification.twiki
index 7eedf87..b27e341 100644
--- a/docs/src/site/twiki/EntitySpecification.twiki
+++ b/docs/src/site/twiki/EntitySpecification.twiki
@@ -922,10 +922,10 @@ The workflow is re-tried after 10 mins, 20 mins and 30 
mins. With exponential ba
 
 To enable retries for instances for feeds, the user will have to set the following properties in runtime.properties:
 <verbatim>
-falcon.recipe.retry.policy=periodic
-falcon.recipe.retry.delay=minutes(30)
-falcon.recipe.retry.attempts=3
-falcon.recipe.retry.onTimeout=false
+falcon.retry.policy=periodic
+falcon.retry.delay=minutes(30)
+falcon.retry.attempts=3
+falcon.retry.onTimeout=false
 </verbatim>
 ---+++ Late data
 Late data handling defines how the late data should be handled. Each feed is 
defined with a late cut-off value which specifies the time till which late data 
is valid. For example, late cut-off of hours(6) means that data for nth hour 
can get delayed by upto 6 hours. Late data specification in process defines how 
this late data is handled.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/Extensions.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Extensions.twiki 
b/docs/src/site/twiki/Extensions.twiki
new file mode 100644
index 0000000..6b4bf11
--- /dev/null
+++ b/docs/src/site/twiki/Extensions.twiki
@@ -0,0 +1,55 @@
+---+ Falcon Extensions
+
+---++ Overview
+
+A Falcon extension is a static process template with a parameterized workflow to realize a specific use case and enable non-programmers to capture and re-use very complex business logic. Extensions are defined in server space. The objective of an extension is to solve a standard data management function that can be invoked as a tool using the standard Falcon features (REST API, CLI and UI access).
+
+For example:
+
+   * Replicating directories from one HDFS cluster to another (not timed partitions)
+   * Replicating hive metadata (database, table, views, etc.)
+   * Replicating between HDFS and Hive - either way
+   * Data masking etc.
+
+---++ Proposal
+
+Falcon provides a Process abstraction that encapsulates the configuration for a user workflow with scheduling controls. All extensions can be modeled as a Process and its dependent feeds within Falcon, which executes the user
+workflow periodically. The process and its associated workflow are parameterized. The user provides properties as <name, value> pairs that are substituted by Falcon before scheduling it. Falcon translates these extensions
+into a process entity by replacing the parameters in the workflow definition.
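+
+For illustration, a minimal set of <name, value> pairs for the HDFS mirroring extension might look like the sketch below. The parameter names come from that extension's JSON definition (see [[restapi/ExtensionDefinition][Extension Definition]]); the values are hypothetical:
+
+<verbatim>
+jobName=hdfs-monthly-sales-dr
+jobClusterName=backupCluster
+jobValidityStart=2016-03-03T00:00Z
+jobValidityEnd=2018-03-13T00:00Z
+jobFrequency=months(1)
+sourceDir=/user/ambari-qa/primaryCluster/dr/input1
+sourceCluster=primaryCluster
+targetDir=/user/ambari-qa/backupCluster/dr
+targetCluster=backupCluster
+</verbatim>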
+
+---++ Falcon extension artifacts to manage extensions
+
+Extension artifacts are published in addons/extensions. Artifacts are expected to be installed on HDFS at the "extension.store.uri" path defined in startup properties. Each extension is expected to have the below artifacts:
+   * a JSON file under the META directory that lists all the required and optional parameters/arguments for scheduling the extension job
+   * a process entity template to be scheduled, under the resources directory
+   * a parameterized workflow, under the resources directory
+   * required libs under the libs directory
+   * a README describing the functionality achieved by the extension
+
+REST API and CLI support has been added for extension artifact management on HDFS. Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details.
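+
+As an illustration, the installed layout for a single extension on HDFS might look like the listing below; the store root /apps/falcon/extensions and the file names under resources are hypothetical, while the META file name is the one shipped with the hdfs-mirroring extension:
+
+<verbatim>
+$ hadoop fs -ls -R /apps/falcon/extensions/hdfs-mirroring
+/apps/falcon/extensions/hdfs-mirroring/META/hdfs-mirroring-properties.json
+/apps/falcon/extensions/hdfs-mirroring/README
+/apps/falcon/extensions/hdfs-mirroring/libs/
+/apps/falcon/extensions/hdfs-mirroring/resources/hdfs-mirroring-template.xml
+/apps/falcon/extensions/hdfs-mirroring/resources/hdfs-mirroring-workflow.xml
+</verbatim>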
+
+---++ CLI and REST API support
+REST API and CLI support has been added to manage extension jobs and instances.
+
+Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs for extension job and instance management.
+
+---++ Metrics
+The HDFS mirroring and Hive mirroring extensions capture replication metrics such as TIMETAKEN, BYTESCOPIED and COPY (number of files copied) for an instance and populate them in the GraphDB.
+
+---++ Sample extensions
+
+Sample extensions are published in addons/extensions
+
+---++ Types of extensions
+   * [[HDFSMirroring][HDFS mirroring extension]]
+   * [[HiveMirroring][Hive mirroring extension]]
+
+---++ Packaging and installation
+
+Extension artifacts in addons/extensions are packaged in the Falcon war under the extensions directory. For manual installation, the user is expected to copy the extension artifacts from the extensions directory of the Falcon war to HDFS at the "extension.store.uri" path defined in startup properties and then restart Falcon, as sketched below.
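+
+A minimal sketch of the manual installation, assuming the war has been exploded at /usr/falcon/server/webapp/falcon and "extension.store.uri" points to /apps/falcon/extensions (both paths hypothetical):
+
+<verbatim>
+# copy the packaged extension artifacts from the exploded war to HDFS
+hadoop fs -mkdir -p /apps/falcon/extensions
+hadoop fs -put /usr/falcon/server/webapp/falcon/extensions/* /apps/falcon/extensions/
+
+# then restart Falcon
+$FALCON_HOME/bin/falcon-stop
+$FALCON_HOME/bin/falcon-start
+</verbatim>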
+
+---++ Migration
+The recipes framework and HDFS mirroring capability were added in the Apache Falcon 0.6.0 release as client-side logic. With the 0.10 release, this has moved to the server side and been renamed server-side extensions. Client-side recipes only had CLI support and required certain pre-steps to get them working. This is no longer required in the 0.10 release, as new CLI and REST API support has been provided.
+
+If a user is migrating to release 0.10 or above, the old recipe setup and CLIs won't work. For manual installation, the user is expected to copy the extension artifacts to HDFS. Please refer to the "Packaging and installation" section above for more details.
+Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs for extension job and instance management.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/FalconDocumentation.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/FalconDocumentation.twiki 
b/docs/src/site/twiki/FalconDocumentation.twiki
index 2d67070..89370ec 100644
--- a/docs/src/site/twiki/FalconDocumentation.twiki
+++ b/docs/src/site/twiki/FalconDocumentation.twiki
@@ -13,7 +13,7 @@
    * <a href="#Falcon_EL_Expressions">Falcon EL Expressions</a>
    * <a href="#Lineage">Lineage</a>
    * <a href="#Security">Security</a>
-   * <a href="#Recipes">Recipes</a>
+   * <a href="#Extensions">Extensions</a>
    * <a href="#Monitoring">Monitoring</a>
    * <a href="#Email_Notification">Email Notification</a>
    * <a href="#Backwards_Compatibility">Backwards Compatibility 
Instructions</a>
@@ -738,9 +738,9 @@ lifecycle policies such as replication and retention.
 
 Security is detailed in [[Security][Security]].
 
----++ Recipes
+---++ Extensions
 
-Recipes is detailed in [[Recipes][Recipes]].
+Extensions are detailed in [[Extensions][Extensions]].
 
 ---++ Monitoring
 

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HDFSDR.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HDFSDR.twiki b/docs/src/site/twiki/HDFSDR.twiki
deleted file mode 100644
index 1c1e3f5..0000000
--- a/docs/src/site/twiki/HDFSDR.twiki
+++ /dev/null
@@ -1,34 +0,0 @@
----+ HDFS DR Recipe
----++ Overview
-Falcon supports HDFS DR recipe to replicate data from source cluster to 
destination cluster.
-
----++ Usage
----+++ Setup cluster definition.
-   <verbatim>
-    $FALCON_HOME/bin/falcon entity -submit -type cluster -file 
/cluster/definition.xml
-   </verbatim>
-
----+++ Update recipes properties
-   Copy HDFS replication recipe properties, workflow and template file from 
$FALCON_HOME/data-mirroring/hdfs-replication to the accessible
-   directory path or to the recipe directory path (*falcon.recipe.path=<recipe 
directory path>*). *"falcon.recipe.path"* must be specified
-   in Falcon conf client.properties. Now update the copied recipe properties 
file with required attributes to replicate data from source cluster to
-   destination cluster for HDFS DR.
-
----+++ Submit HDFS DR recipe
-
-   After updating the recipe properties file with required attributes in 
directory path or in falcon.recipe.path,
-   there are two ways of submitting the HDFS DR recipe:
-
-   * 1. Specify Falcon recipe properties file through recipe command line.
-   <verbatim>
-    $FALCON_HOME/bin/falcon recipe -name hdfs-replication -operation 
HDFS_REPLICATION
-    -properties /cluster/hdfs-replication.properties
-   </verbatim>
-
-   * 2. Use Falcon recipe path specified in Falcon conf client.properties .
-   <verbatim>
-    $FALCON_HOME/bin/falcon recipe -name hdfs-replication -operation 
HDFS_REPLICATION
-   </verbatim>
-
-
-*Note:* Recipe properties file, workflow file and template file name must 
match to the recipe name, it must be unique and in the same directory.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HDFSMirroring.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HDFSMirroring.twiki 
b/docs/src/site/twiki/HDFSMirroring.twiki
new file mode 100644
index 0000000..a810947
--- /dev/null
+++ b/docs/src/site/twiki/HDFSMirroring.twiki
@@ -0,0 +1,27 @@
+---+ HDFS Mirroring Extension
+---++ Overview
+Falcon supports an HDFS mirroring extension to replicate data from a source cluster to a destination cluster. This extension replicates arbitrary directories on HDFS and piggybacks on the replication solution in Falcon, which uses the DistCp tool. It also allows users to replicate data from on-premise clusters to cloud storage, either Azure WASB or S3.
+
+---++ Use Case
+* Copy directories between HDFS clusters, without dated partitions
+* Archive directories from HDFS to Cloud. Ex: S3, Azure WASB
+
+---++ Limitations
+As the data volume and number of files grow, this can get inefficient.
+
+---++ Usage
+---+++ Setup source and destination clusters
+   <verbatim>
+    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
+   </verbatim>
+
+---+++ HDFS mirroring extension properties
+   Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in startup properties. The hdfs-mirroring-properties.json file located at "<extension.store.uri>/hdfs-mirroring/META/hdfs-mirroring-properties.json" lists all the required and optional parameters/arguments for scheduling the HDFS mirroring job.
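+
+   The same parameter list can also be viewed through the CLI, for example:
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -definition -name hdfs-mirroring
+   </verbatim>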
+
+---+++ Submit and schedule HDFS mirroring extension
+
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hdfs-mirroring -file /process/definition.xml
+   </verbatim>
+
+   Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HiveDR.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HiveDR.twiki b/docs/src/site/twiki/HiveDR.twiki
deleted file mode 100644
index cf35694..0000000
--- a/docs/src/site/twiki/HiveDR.twiki
+++ /dev/null
@@ -1,80 +0,0 @@
----+Hive Disaster Recovery
-
-
----++Overview
-Falcon provides feature to replicate Hive metadata and data events from source 
cluster
-to destination cluster. This is supported for secure and unsecure cluster 
through Falcon Recipes.
-
-
----++Prerequisites
-Following is the prerequisites to use Hive DR
-
-   * *Hive 1.2.0+*
-   * *Oozie 4.2.0+*
-
-*Note:* Set following properties in hive-site.xml for replicating the Hive 
events on source and destination Hive cluster:
-<verbatim>
-    <property>
-        <name>hive.metastore.event.listeners</name>
-        <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
-        <description>event listeners that are notified of any metastore 
changes</description>
-    </property>
-
-    <property>
-        <name>hive.metastore.dml.events</name>
-        <value>true</value>
-    </property>
-</verbatim>
-
----++ Usage
----+++ Bootstrap
-   Perform initial bootstrap of Table and Database from source cluster to 
destination cluster
-   * *Database Bootstrap*
-     For bootstrapping DB replication, first destination DB should be created. 
This step is expected,
-     since DB replication definitions can be set up by users only on 
pre-existing DB’s. Second, Export all tables in
-     the source db and Import it in the destination db, as described in Table 
bootstrap.
-
-   * *Table Bootstrap*
-     For bootstrapping table replication, essentially after having turned on 
the !DbNotificationListener
-     on the source db, perform an Export of the table, distcp the Export over 
to the destination
-     warehouse and do an Import over there. Check the following 
[[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport][Hive
 Export-Import]] for syntax details
-     and examples.
-     This will set up the destination table so that the events on the source 
cluster that modify the table
-     will then be replicated.
-
----+++ Setup cluster definition
-   <verbatim>
-    $FALCON_HOME/bin/falcon entity -submit -type cluster -file 
/cluster/definition.xml
-   </verbatim>
-
----+++ Update recipes properties
-   Copy Hive DR recipe properties, workflow and template file from 
$FALCON_HOME/data-mirroring/hive-disaster-recovery to the accessible
-   directory path or to the recipe directory path (*falcon.recipe.path=<recipe 
directory path>*). *"falcon.recipe.path"* must be specified
-   in Falcon conf client.properties. Now update the copied recipe properties 
file with required attributes to replicate metadata and data from source 
cluster to
-   destination cluster for Hive DR.
-
-   * *Note : HiveDR on TDE encrypted clusters*
-   When submitting HiveDR recipe in a kerberos secured setup, it is possible 
that the source and target staging directories
-   are encrypted using Transparent Data Encryption (TDE). If your cluster dirs 
are TDE encrypted, please set
-   "tdeEncryptionEnabled=true" in the recipe properties file. Default value 
for this property is "false".
-
----+++ Submit Hive DR recipe
-   After updating the recipe properties file with required attributes in 
directory path or in falcon.recipe.path,
-   there are two ways of submitting the Hive DR recipe:
-
-   * 1. Specify Falcon recipe properties file through recipe command line.
-   <verbatim>
-       $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation 
HIVE_DISASTER_RECOVERY
-       -properties /cluster/hive-disaster-recovery.properties
-   </verbatim>
-
-   * 2. Use Falcon recipe path specified in Falcon conf client.properties .
-   <verbatim>
-       $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation 
HIVE_DISASTER_RECOVERY
-   </verbatim>
-
-
-*Note:*
-   * Recipe properties file, workflow file and template file name must match 
to the recipe name, it must be unique and in the same directory.
-   * If kerberos security is enabled on cluster, use the secure templates for 
Hive DR from $FALCON_HOME/data-mirroring/hive-disaster-recovery .
-

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HiveMirroring.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HiveMirroring.twiki 
b/docs/src/site/twiki/HiveMirroring.twiki
new file mode 100644
index 0000000..e28502a
--- /dev/null
+++ b/docs/src/site/twiki/HiveMirroring.twiki
@@ -0,0 +1,63 @@
+---+Hive Mirroring
+
+---++Overview
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster. This is supported for both secure and unsecure clusters through Falcon extensions.
+
+---++Prerequisites
+Following are the prerequisites to use Hive mirroring:
+
+   * *Hive 1.2.0+*
+   * *Oozie 4.2.0+*
+
+*Note:* Set the following properties in hive-site.xml for replicating Hive events on the source and destination Hive clusters:
+<verbatim>
+    <property>
+        <name>hive.metastore.event.listeners</name>
+        <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
+        <description>event listeners that are notified of any metastore changes</description>
+    </property>
+
+    <property>
+        <name>hive.metastore.dml.events</name>
+        <value>true</value>
+    </property>
+</verbatim>
+
+---++ Use Case
+* Replicate data/metadata of Hive DB & table from source to target cluster
+
+---++ Limitations
+* Currently Hive doesn't support replication events for create database, roles, views, offline tables, direct HDFS writes without registering with the metastore, and database/table name mapping. Hence the Hive mirroring extension cannot be used to replicate the above-mentioned events between warehouses.
+
+---++ Usage
+---+++ Bootstrap
+   Perform an initial bootstrap of the tables and databases from the source cluster to the destination cluster.
+   * *Database Bootstrap*
+     For bootstrapping DB replication, the destination DB should be created first. This step is expected,
+     since DB replication definitions can be set up by users only on pre-existing DBs. Second, Export all tables in
+     the source DB and Import them in the destination DB, as described in Table bootstrap.
+
+   * *Table Bootstrap*
+     For bootstrapping table replication, essentially after having turned on the !DbNotificationListener
+     on the source DB, perform an Export of the table, distcp the Export over to the destination
+     warehouse and do an Import over there. Check the following [[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport][Hive Export-Import]] for syntax details and examples.
+     This will set up the destination table so that the events on the source cluster that modify the table
+     will then be replicated.
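+
+     A minimal sketch of a table bootstrap, assuming a table sales.orders and a staging path /apps/hive/staging (both hypothetical):
+     <verbatim>
+      # on the source cluster: export table data and metadata
+      hive -e "EXPORT TABLE sales.orders TO '/apps/hive/staging/orders_export';"
+
+      # copy the export to the destination warehouse
+      hadoop distcp hdfs://source-nn:8020/apps/hive/staging/orders_export hdfs://target-nn:8020/apps/hive/staging/orders_export
+
+      # on the destination cluster: import the table
+      hive -e "IMPORT TABLE sales.orders FROM '/apps/hive/staging/orders_export';"
+     </verbatim>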
+
+---+++  Setup source and destination clusters
+   <verbatim>
+    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
+   </verbatim>
+
+---+++ Hive mirroring extension properties
+   Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in startup properties. The hive-mirroring-properties.json file located at "<extension.store.uri>/hive-mirroring/META/hive-mirroring-properties.json" lists all the required and optional parameters/arguments for scheduling the Hive mirroring job.
+
+---+++ Submit and schedule Hive mirroring extension
+
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hive-mirroring -file /process/definition.xml
+   </verbatim>
+
+   Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs.
+

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/Recipes.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Recipes.twiki 
b/docs/src/site/twiki/Recipes.twiki
deleted file mode 100644
index b5faa1e..0000000
--- a/docs/src/site/twiki/Recipes.twiki
+++ /dev/null
@@ -1,85 +0,0 @@
----+ Falcon Recipes
-
----++ Overview
-
-A Falcon recipe is a static process template with parameterized workflow to 
realize a specific use case. Recipes are
-defined in user space. Recipes will not have support for update or lifecycle 
management.
-
-For example:
-
-   * Replicating directories from one HDFS cluster to another (not timed 
partitions)
-   * Replicating hive metadata (database, table, views, etc.)
-   * Replicating between HDFS and Hive - either way
-   * Data masking etc.
-
----++ Proposal
-
-Falcon provides a Process abstraction that encapsulates the configuration for 
a user workflow with scheduling
-controls. All recipes can be modeled as a Process with in Falcon which 
executes the user workflow periodically. The
-process and its associated workflow are parameterized. The user will provide a 
properties file with name value pairs
-that are substituted by falcon before scheduling it. Falcon translates these 
recipes as a process entity by
-replacing the parameters in the workflow definition.
-
----++ Falcon CLI recipe support
-
-Falcon CLI functionality to support recipes has been added.
-[[falconcli/FalconCLI][Falcon CLI]] Recipe command usage is defined here.
-
-CLI accepts recipe option with a recipe name and optional tool and does the 
following:
-   * Validates the options; name option is mandatory and tool is optional and 
should be provided if user wants to override the base recipe tool
-   * Looks for <name>-workflow.xml, <name>-template.xml and <name>.properties 
file in the path specified by falcon.recipe.path in client.properties. If files 
cannot be found then Falcon CLI will fail
-   * Invokes a Tool to substitute the properties in the templated process for 
the recipe. By default invokes base tool if tool option is not passed. Tool is 
responsible for generating process entity at the path specified by FalconCLI
-   * Validates the generated entity
-   * Submit and schedule this entity
-   * Generated process entity files are stored in tmp directory
-
----++ Base Recipe tool
-
-Falcon provides a base tool that recipes can override. Base Recipe tool does 
the following:
-   * Expects recipe template file path, recipe properties file path and path 
where process entity to be submitted should be generated. Validates these 
arguments
-   * Validates the artifacts i.e. workflow and/or lib files specified in the 
recipe template exists on local filesystem or HDFS at the specified path else 
returns error
-   * Copies if the artifacts exists on local filesystem
-      * If workflow is on local FS then falcon.recipe.workflow.path in recipe 
property file is mandatory for it to be copied to HDFS. If templated process 
requires custom libs falcon.recipe.workflow.lib.path property is mandatory for 
them to be copied from Local FS to HDFS. Recipe tool will copy the local 
artifacts only if these properties are set in properties file
-   * Looks for the patten ##[A-Za-z0-9_.]*## in the templated process and 
substitutes it with the properties. Process entity generated after the 
substitution is written to the empty file passed by FalconCLI
-
----++ Recipe template file format
-
-   * Any templatized string should be in the format ##[A-Za-z0-9_.]*##.
-   * There should be a corresponding entry in the recipe properties file 
"falcon.recipe.<templatized-string> = <value to be substituted>"
-
-<verbatim>
-Example: If the entry in recipe template is <workflow 
name="##workflow.name##"> there should be a corresponding entry in the recipe 
properties file falcon.recipe.workflow.name=hdfs-dr-workflow
-</verbatim>
-
----++ Recipe properties file format
-
-   * Regular key value pair properties file
-   * Property key should be prefixed by "falcon.recipe."
-
-<verbatim>
-Example: falcon.recipe.workflow.name=hdfs-dr-workflow
-Recipe template will have <workflow name="##workflow.name##">. Recipe tool 
will look for the patten ##workflow.name##
-and replace it with the property value "hdfs-dr-workflow". Substituted 
template will have <workflow name="hdfs-dr-workflow">
-</verbatim>
-
----++ Metrics
-HDFS DR and Hive DR recipes will capture the replication metrics like 
TIMETAKEN, BYTESCOPIED, COPY (number of files copied) for an
-instance and populate to the GraphDB.
-
----++ Managing the scheduled recipe process
-   * Scheduled recipe process is similar to regular process
-      * List : falcon entity -type process -name <recipe-process-name> -list
-      * Status : falcon entity -type process -name <recipe-process-name> 
-status
-      * Delete : falcon entity -type process -name <recipe-process-name> 
-delete
-
----++ Sample recipes
-
-   * Sample recipes are published in addons/recipes
-
----++ Types of recipes
-   * [[HDFSDR][HDFS Recipe]]
-   * [[HiveDR][HiveDR Recipe]]
-
----++ Packaging
-
-   * There is no packaging for recipes at this time but will be added soon.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/DefineExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/DefineExtension.twiki 
b/docs/src/site/twiki/falconcli/DefineExtension.twiki
new file mode 100644
index 0000000..c260911
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/DefineExtension.twiki
@@ -0,0 +1,8 @@
+---+++Definition
+
+[[CommonCLI][Common CLI Options]]
+
+Definition of an extension. Outputs a JSON document describing the extension invocation parameters.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -definition -name <extension-name>
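+
+Example, using the hdfs-mirroring extension shipped in addons/extensions:
+$FALCON_HOME/bin/falcon extension -definition -name hdfs-mirroring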

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/DescribeExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/DescribeExtension.twiki 
b/docs/src/site/twiki/falconcli/DescribeExtension.twiki
new file mode 100644
index 0000000..9f9895e
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/DescribeExtension.twiki
@@ -0,0 +1,8 @@
+---+++Describe
+
+[[CommonCLI][Common CLI Options]]
+
+Description of an extension. Outputs the README of the specified extension.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -describe -name <extension-name>
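+
+Example, using the hive-mirroring extension shipped in addons/extensions:
+$FALCON_HOME/bin/falcon extension -describe -name hive-mirroring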

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/EnumerateExtension.twiki 
b/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
new file mode 100644
index 0000000..0b28630
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
@@ -0,0 +1,8 @@
+---+++Enumerate
+
+[[CommonCLI][Common CLI Options]]
+
+List all the extensions supported. Returns the total number of results and a list of the server-side extensions supported.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -enumerate
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/FalconCLI.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/FalconCLI.twiki 
b/docs/src/site/twiki/falconcli/FalconCLI.twiki
index 2290569..dedd40c 100644
--- a/docs/src/site/twiki/falconcli/FalconCLI.twiki
+++ b/docs/src/site/twiki/falconcli/FalconCLI.twiki
@@ -11,7 +11,8 @@ CLI options are classified into :
    * <a href="#Instance_Management_Commands">Instance Management Commands</a>
    * <a href="#Metadata_Commands">Metadata Commands</a>
    * <a href="#Admin_Commands">Admin commands</a>
-   * <a href="#Recipe_Commands">Recipe commands</a>
+   * <a href="#Extension_Artifacts_Commands">Extension artifacts commands</a>
+   * <a href="#Extension_Commands">Extension commands</a>
 
 
 
@@ -104,10 +105,19 @@ $FALCON_HOME/bin/falcon entity -submit -type cluster 
-file /cluster/definition.x
 
 -----------
 
----++Recipe Commands
+---++Extension Artifacts Commands
 
 | *Command*                                      | *Description*                                    |
-|[[SubmitRecipe][Submit]]                        | Submit the specified Recipe                      |
+|[[EnumerateExtension][Enumerate]]               | Return all the extensions supported              |
+|[[DescribeExtension][Describe]]                 | Return description of an extension               |
+|[[DefineExtension][Definition]]                 | Return the definition of an extension            |
+
+-----------
+
+---++Extension Commands
+
| *Command*                                      | *Description*                                    |
+|[[SubmitExtension][Submit]]                     | Submit the specified extension                   |
 
 
 

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionDefinition.twiki 
b/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
new file mode 100644
index 0000000..66f6674
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
@@ -0,0 +1,160 @@
+---++  GET api/extension/definition/:extension-name
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Get definition of the extension.
+
+---++ Parameters
+   * :extension-name Name of the extension.
+
+---++ Results
+Outputs a JSON document describing the extension invocation parameters.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/definition/hdfs-mirroring
+</verbatim>
+---+++ Result
+<verbatim>
+{
+    "shortDescription": "This extension implements replicating arbitrary 
directories on HDFS from one Hadoop cluster to another Hadoop cluster. This 
piggy backs on replication solution in Falcon which uses the DistCp tool.",
+    "properties":[
+        {
+            "propertyName":"jobName",
+            "required":true,
+            "description":"Unique job name",
+            "example":"hdfs-monthly-sales-dr"
+        },
+        {
+            "propertyName":"jobClusterName",
+            "required":true,
+            "description":"Cluster where job should run",
+            "example":"backupCluster"
+        },
+        {
+            "propertyName":"jobValidityStart",
+            "required":true,
+            "description":"Job validity start time",
+            "example":"2016-03-03T00:00Z"
+        },
+        {
+            "propertyName":"jobValidityEnd",
+            "required":true,
+            "description":"Job validity end time",
+            "example":"2018-03-13T00:00Z"
+        },
+        {
+            "propertyName":"jobFrequency",
+            "required":true,
+            "description":"job frequency. Valid frequency types are minutes, 
hours, days, months",
+            "example":"months(1)"
+        },
+        {
+            "propertyName":"jobTimezone",
+            "required":false,
+            "description":"Time zone for the job",
+            "example":"GMT"
+        },
+        {
+            "propertyName":"jobTags",
+            "required":false,
+            "description":"list of comma separated tags. Key Value Pairs, 
separated by comma",
+            "example":"[email protected], [email protected], 
_department_type=forecasting"
+        },
+        {
+            "propertyName":"jobRetryPolicy",
+            "required":false,
+            "description":"Job retry policy",
+            "example":"periodic"
+        },
+        {
+            "propertyName":"jobRetryDelay",
+            "required":false,
+            "description":"Job retry delay",
+            "example":"minutes(30)"
+        },
+        {
+            "propertyName":"jobRetryAttempts",
+            "required":false,
+            "description":"Job retry attempts",
+            "example":"3"
+        },
+        {
+            "propertyName":"jobRetryOnTimeout",
+            "required":false,
+            "description":"Job retry on timeout",
+            "example":"true"
+        },
+        {
+            "propertyName":"jobAclOwner",
+            "required":false,
+            "description":"ACL owner",
+            "example":"ambari-qa"
+        },
+        {
+            "propertyName":"jobAclGroup",
+            "required":false,
+            "description":"ACL group",
+            "example":"users"
+        },
+        {
+            "propertyName":"jobAclPermission",
+            "required":false,
+            "description":"ACL permission",
+            "example":"0x755"
+        },
+        {
+            "propertyName":"sourceDir",
+            "required":true,
+            "description":"Multiple hdfs comma separated source directories",
+            "example":"/user/ambari-qa/primaryCluster/dr/input1, 
/user/ambari-qa/primaryCluster/dr/input2"
+        },
+        {
+            "propertyName":"sourceCluster",
+            "required":true,
+            "description":"Source cluster for hdfs mirroring",
+            "example":"primaryCluster"
+        },
+        {
+            "propertyName":"targetDir",
+            "required":true,
+            "description":"Target hdfs directory",
+            "example":"/user/ambari-qa/backupCluster/dr"
+        },
+        {
+            "propertyName":"targetCluster",
+            "required":true,
+            "description":"Target cluster for hdfs mirroring",
+            "example":"backupCluster"
+        },
+        {
+            "propertyName":"distcpMaxMaps",
+            "required":false,
+            "description":"Maximum number of mappers for DistCP",
+            "example":"1"
+        },
+        {
+            "propertyName":"distcpMapBandwidth",
+            "required":false,
+            "description":"Bandwidth in MB for each mapper in DistCP",
+            "example":"100"
+        },
+        {
+            "propertyName":"jobNotificationType",
+            "required":false,
+            "description":"Email Notification for Falcon instance completion",
+            "example":"email"
+        },
+        {
+            "propertyName":"jobNotificationReceivers",
+            "required":false,
+            "description":"Comma separated email Id's",
+            "example":"[email protected], [email protected]"
+        }
+    ]
+}
+</verbatim>

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionDescription.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionDescription.twiki 
b/docs/src/site/twiki/restapi/ExtensionDescription.twiki
new file mode 100644
index 0000000..5900fbb
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionDescription.twiki
@@ -0,0 +1,24 @@
+---++  GET api/extension/describe/:extension-name
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Description of an extension.
+
+---++ Parameters
+   * :extension-name Name of the extension.
+
+---++ Results
+Outputs the README of the specified extension.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/describe/hdfs-mirroring
+</verbatim>
+---+++ Result
+<verbatim>
+<README file of the specified extension>
+</verbatim>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki 
b/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
new file mode 100644
index 0000000..abd94c8
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
@@ -0,0 +1,38 @@
+---++  GET api/extension/enumerate
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Get list of the supported extensions.
+
+---++ Parameters
+None
+
+---++ Results
+The total number of results and a list of the server-side extensions supported.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/enumerate
+</verbatim>
+---+++ Result
+<verbatim>
+{
+    "totalResults":"2",
+    "extensions": [
+        {
+            "name": "Hdfs-mirroring",
+            "type": "Trusted/Provided extension",
+            "description": "This extension implements replicating arbitrary directories on HDFS from one Hadoop cluster to another Hadoop cluster."
+        },
+        {
+            "name": "Hive-mirroring",
+            "type": "Trusted/Provided extension",
+            "description": "This extension implements replicating hive metadata and data from one Hadoop cluster to another Hadoop cluster."
+        }
+    ]
+}
+</verbatim>
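+
+For example, the call above can be issued with curl; the user.name query parameter in this sketch assumes the server runs with simple/pseudo authentication:
+<verbatim>
+curl -s "http://localhost:15000/api/extension/enumerate?user.name=falcon"
+</verbatim>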

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ResourceList.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ResourceList.twiki 
b/docs/src/site/twiki/restapi/ResourceList.twiki
index 34c2c6f..f703843 100644
--- a/docs/src/site/twiki/restapi/ResourceList.twiki
+++ b/docs/src/site/twiki/restapi/ResourceList.twiki
@@ -6,6 +6,7 @@
    * <a href="#REST_Call_on_Admin_Resource">REST Call on Admin Resource</a>
    * <a href="#REST_Call_on_Lineage_Graph">REST Call on Lineage Graph 
Resource</a>
    * <a href="#REST_Call_on_Metadata_Resource">REST Call on Metadata 
Resource</a>
+   * <a href="#REST_Call_on_Extension_Artifact">REST Call on Extension 
artifact</a>
 
 ---++ Authentication
 
@@ -88,6 +89,13 @@ The current version of the rest api's documentation is also 
hosted on the Falcon
 
 ---++ REST Call on Metadata Discovery Resource
 
-| *Call Type* | *Resource*                                                     
                                | *Description*                                 
                                |
-| GET         | [[MetadataList][api/metadata/discovery/:dimension-type/list]]  
                                | list of dimensions  |
-| GET         | 
[MetadataRelations][api/metadata/discovery/:dimension-type/:dimension-name/relations]]
         | Return all relations of a dimension |
+| *Call Type* | *Resource*                                                                                      | *Description*                       |
+| GET         | [[MetadataList][api/metadata/discovery/:dimension-type/list]]                                  | list of dimensions                  |
+| GET         | [[MetadataRelations][api/metadata/discovery/:dimension-type/:dimension-name/relations]]        | Return all relations of a dimension |
+
+---++ REST Call on Extension Artifact
+
+| *Call Type* | *Resource*                                                        | *Description*                                                          |
+| GET         | [[ExtensionEnumeration][api/extension/enumerate]]                 | List all the extensions supported                                      |
+| GET         | [[ExtensionDescription][api/extension/describe/:extension-name]]  | Return the README of the specified extension                           |
+| GET         | [[ExtensionDefinition][api/extension/definition/:extension-name]] | Return a JSON document describing the extension invocation parameters  |
