ATLAS-2317:[Docs] Add HBase Bridge Documents

Signed-off-by: Madhan Neethiraj <mad...@apache.org>
Project: http://git-wip-us.apache.org/repos/asf/atlas/repo
Commit: http://git-wip-us.apache.org/repos/asf/atlas/commit/540129f5
Tree: http://git-wip-us.apache.org/repos/asf/atlas/tree/540129f5
Diff: http://git-wip-us.apache.org/repos/asf/atlas/diff/540129f5

Branch: refs/heads/master
Commit: 540129f5c39181d347a3f85d85715186c8a8f066
Parents: c9924fd
Author: rmani <rm...@hortonworks.com>
Authored: Tue Apr 24 14:01:50 2018 -0700
Committer: Madhan Neethiraj <mad...@apache.org>
Committed: Wed Apr 25 10:40:36 2018 -0700

----------------------------------------------------------------------
 docs/src/site/twiki/Bridge-HBase.twiki | 62 +++++++++++++++++++++++++++++
 docs/src/site/twiki/index.twiki        | 12 +++---
 2 files changed, 69 insertions(+), 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/atlas/blob/540129f5/docs/src/site/twiki/Bridge-HBase.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Bridge-HBase.twiki b/docs/src/site/twiki/Bridge-HBase.twiki
new file mode 100644
index 0000000..7a5c908
--- /dev/null
+++ b/docs/src/site/twiki/Bridge-HBase.twiki
@@ -0,0 +1,62 @@
+---+ HBase Atlas Bridge
+
+---++ HBase Model
+The default HBase model includes the following types:
+   * Entity types:
+      * hbase_namespace
+         * super-types: !Asset
+         * attributes: name, owner, description, type, classifications, term, clustername, parameters, createtime, modifiedtime, qualifiedName
+      * hbase_table
+         * super-types: !DataSet
+         * attributes: name, owner, description, type, classifications, term, uri, column_families, namespace, parameters, createtime, modifiedtime, maxfilesize,
+           isReadOnly, isCompactionEnabled, isNormalizationEnabled, ReplicaPerRegion, Durability, qualifiedName
+      * hbase_column_family
+         * super-types: !DataSet
+         * attributes: name, owner, description, type, classifications, term, columns, createtime, bloomFilterType, compressionType, CompactionCompressionType, EncryptionType,
+           inMemoryCompactionPolicy, keepDeletedCells, Maxversions, MinVersions, datablockEncoding, storagePolicy, Ttl, blockCachedEnabled, cacheBloomsOnWrite,
+           cacheDataOnWrite, EvictBlocksOnClose, PrefetchBlocksOnOpen, NewVersionsBehavior, isMobEnabled, MobCompactPartitionPolicy, qualifiedName
+
+The entities are created and de-duped using a unique qualified name. The qualified name provides a namespace and can also be used for querying:
+   * hbase_namespace.qualifiedName - <namespace>@<clusterName>
+   * hbase_table.qualifiedName - <namespace>:<tableName>@<clusterName>
+   * hbase_column_family.qualifiedName - <namespace>:<tableName>.<columnFamily>@<clusterName>
+
+
+---++ Importing HBase Metadata
+org.apache.atlas.hbase.bridge.HBaseBridge imports the HBase metadata into Atlas using the model defined above. The import-hbase.sh command can be used to run the import.
+   <verbatim>
+   Usage 1: <atlas package>/hook-bin/import-hbase.sh
+   Usage 2: <atlas package>/hook-bin/import-hbase.sh [-n <namespace regex> OR --namespace <namespace regex>] [-t <table regex> OR --table <table regex>]
+   Usage 3: <atlas package>/hook-bin/import-hbase.sh [-f <filename>]
+        File Format:
+        namespace1:tbl1
+        namespace1:tbl2
+        namespace2:tbl1
+   </verbatim>
+
+The logs are in <atlas package>/logs/import-hbase.log
+
+---++ HBase Hook
+The Atlas HBase hook registers with HBase to listen for create/update/delete operations and, via Kafka notifications, updates the metadata in Atlas to reflect the changes in HBase.
+Follow the instructions below to set up the Atlas hook in HBase:
+   * Set up the Atlas hook in hbase-site.xml by adding the following:
+      <verbatim>
+      <property>
+         <name>hbase.coprocessor.master.classes</name>
+         <value>org.apache.atlas.hbase.hook.HBaseAtlasCoprocessor</value>
+      </property></verbatim>
+   * Copy <atlas package>/hook/hbase/<All files and folder> to the HBase class path. The HBase hook binary files are present in apache-atlas-<release-version>-SNAPSHOT-hbase-hook.tar.gz
+   * Copy <atlas-conf>/atlas-application.properties to the HBase conf directory.
+
+The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
+   * atlas.hook.hbase.synchronous - boolean, true to run the hook synchronously. default false. Recommended to be set to false to avoid delays in HBase operations.
+   * atlas.hook.hbase.numRetries - number of retries for notification failure. default 3
+   * atlas.hook.hbase.minThreads - core number of threads. default 1
+   * atlas.hook.hbase.maxThreads - maximum number of threads. default 5
+   * atlas.hook.hbase.keepAliveTime - keep alive time in msecs. default 10
+   * atlas.hook.hbase.queueSize - queue size for the threadpool. default 10000
+
+Refer to [[Configuration][Configuration]] for notification-related configurations.
+
+---++ NOTES
+   * Only the namespace, table and column family create / update / delete operations are captured by the hook. Column changes won't be captured and propagated.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/atlas/blob/540129f5/docs/src/site/twiki/index.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/index.twiki b/docs/src/site/twiki/index.twiki
index 5e9a1cb..df7e7a3 100755
--- a/docs/src/site/twiki/index.twiki
+++ b/docs/src/site/twiki/index.twiki
@@ -57,11 +57,13 @@ capabilities around these data assets for data scientists, analysts and the data
    * [[Configuration][Configuration]]
    * Notification
       * [[Notification-Entity][Entity Notification]]
-   * Bridges
-      * [[Bridge-Hive][Hive Bridge]]
-      * [[Bridge-Sqoop][Sqoop Bridge]]
-      * [[Bridge-Falcon][Falcon Bridge]]
-      * [[StormAtlasHook][Storm Bridge]]
+   * Hooks & Bridges
+      * [[Bridge-HBase][HBase Hook & Bridge]]
+      * [[Bridge-Hive][Hive Hook & Bridge]]
+      * [[Bridge-Kafka][Kafka Bridge]]
+      * [[Bridge-Sqoop][Sqoop Hook]]
+      * [[StormAtlasHook][Storm Hook]]
+      * [[Bridge-Falcon][Falcon Hook]]
    * [[HighAvailability][Fault Tolerance And High Availability Options]]
 
 ---++ API Documentation
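
Note on the qualifiedName formats documented in Bridge-HBase.twiki: the three patterns can be illustrated with a minimal Python sketch. The function names and sample values below are hypothetical and exist only for illustration; they are not part of Atlas or of this commit, and the sketch assumes the names are used exactly as given, with no case normalization or escaping.

```python
# Hypothetical helpers that assemble qualifiedName strings following the
# three patterns listed in the HBase Model section. Not an Atlas API.

def namespace_qualified_name(namespace, cluster_name):
    # <namespace>@<clusterName>
    return f"{namespace}@{cluster_name}"


def table_qualified_name(namespace, table_name, cluster_name):
    # <namespace>:<tableName>@<clusterName>
    return f"{namespace}:{table_name}@{cluster_name}"


def column_family_qualified_name(namespace, table_name, column_family, cluster_name):
    # <namespace>:<tableName>.<columnFamily>@<clusterName>
    return f"{namespace}:{table_name}.{column_family}@{cluster_name}"


if __name__ == "__main__":
    # Sample values are made up for the example.
    print(namespace_qualified_name("sales", "primary"))
    print(table_qualified_name("sales", "orders", "primary"))
    print(column_family_qualified_name("sales", "orders", "details", "primary"))
```

Because each pattern embeds the cluster name, entities with the same namespace and table name on different clusters would get distinct qualified names, which is what allows them to be de-duplicated per cluster.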