Ramesh Mani created ATLAS-2649:
----------------------------------
Summary: Hive Hook should create lineage entities when the storage
handler mechanism is used to create HBase tables via Hive
Key: ATLAS-2649
URL: https://issues.apache.org/jira/browse/ATLAS-2649
Project: Atlas
Issue Type: Bug
Reporter: Ramesh Mani
Hive Hook should create lineage entities when the storage handler mechanism
is used to create HBase tables via Hive.
When a Hive table is backed by HBase via Hive's HBaseStorageHandler mechanism,
a corresponding HBase table is created in HBase and the data is stored in it.
In this case the Hive Hook should record a lineage process with the Hive table
as input and the HBase table as output.
For example:
CREATE TABLE hbase_table_emp(id int, name string, role string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
TBLPROPERTIES ("hbase.table.name" = "emp");
This creates a corresponding HBase table named emp:
hbase(main):003:0> list
TABLE
ATLAS_ENTITY_AUDIT_EVENTS
atlas_janus
emp
3 row(s)
Took 0.0127 seconds
=> ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_janus", "emp"]
hbase(main):004:0> describe 'emp'
Table emp is ENABLED
emp
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false',
CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.1961 seconds
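For reference, the "hbase.columns.mapping" value in the DDL above pairs with the Hive columns by position, which is why describe 'emp' lists only the cf1 column family. A minimal Python sketch of that pairing follows; the helper names map_columns and column_families are hypothetical and are not part of Hive, HBase, or Atlas.

```python
def map_columns(hive_columns, mapping):
    """Pair each Hive column with its HBase target, by position."""
    targets = [t.strip() for t in mapping.split(",")]
    if len(targets) != len(hive_columns):
        raise ValueError("mapping needs exactly one entry per Hive column")
    return dict(zip(hive_columns, targets))


def column_families(mapping):
    """Return the distinct column families named in the mapping, in order."""
    families = []
    for target in mapping.split(","):
        target = target.strip()
        # Entries such as ":key" start with a colon and name no family.
        if target.startswith(":"):
            continue
        family = target.split(":", 1)[0]
        if family not in families:
            families.append(family)
    return families


mapping = ":key,cf1:name,cf1:role"
pairs = map_columns(["id", "name", "role"], mapping)
families = column_families(mapping)
```

Here id maps to the HBase row key, while name and role both land in column family cf1.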
In this case the Hive hook should provide lineage information from the Hive
table to the corresponding HBase table.
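The requested lineage could look roughly like the following sketch. It is an illustration only, not the actual fix: the function build_lineage is hypothetical, and the type names (hive_process, hive_table, hbase_table), the qualifiedName formats, and the cluster name "primary" are assumptions modeled loosely on Atlas conventions.

```python
def build_lineage(hive_db, hive_table, tbl_properties, cluster="primary"):
    """Build a process-like dict linking a Hive table to its HBase table."""
    hbase_name = tbl_properties.get("hbase.table.name")
    if not hbase_name:
        raise ValueError("hbase.table.name is required in this sketch")

    # Assumption: tables without an explicit namespace live in "default".
    if ":" in hbase_name:
        namespace, table = hbase_name.split(":", 1)
    else:
        namespace, table = "default", hbase_name

    # Assumed qualifiedName formats; the real hook may differ.
    hive_qn = f"{hive_db}.{hive_table}@{cluster}"
    hbase_qn = f"{namespace}:{table}@{cluster}"

    return {
        "typeName": "hive_process",
        "attributes": {
            "name": f"{hive_qn} -> {hbase_qn}",
            "inputs": [
                {"typeName": "hive_table",
                 "uniqueAttributes": {"qualifiedName": hive_qn}}
            ],
            "outputs": [
                {"typeName": "hbase_table",
                 "uniqueAttributes": {"qualifiedName": hbase_qn}}
            ],
        },
    }


lineage = build_lineage("default", "hbase_table_emp",
                        {"hbase.table.name": "emp"})
```

The Hive table appears as the process input and the HBase table as the output, matching the direction described in this issue.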