[ 
https://issues.apache.org/jira/browse/ATLAS-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Mani updated ATLAS-2649:
-------------------------------
    Summary: Updated Hive Hook to create lineage between HBase table and Hive 
table  (was: Hive Hook should create lineage entities when storage handler 
mechanism to create hbase tables via hive)

> Updated Hive Hook to create lineage between HBase table and Hive table
> ----------------------------------------------------------------------
>
>                 Key: ATLAS-2649
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2649
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: trunk
>            Reporter: Ramesh Mani
>            Assignee: Ramesh Mani
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: 
> 0001-ATLAS-2649-Hive-Hook-should-create-lineage-entities-.patch
>
>
> Hive Hook should create lineage entities when Hive's storage handler 
> mechanism is used to create HBase tables via Hive.
> When a Hive table is backed by HBase through Hive's HBaseStorageHandler 
> mechanism, a corresponding HBase table is created in HBase and the data is 
> stored in it. In this case the Hive Hook should show the Hive table as the 
> process input and the HBase table as the process output.
> For example:
> CREATE TABLE hbase_table_emp(id int, name string, role string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
> TBLPROPERTIES ("hbase.table.name" = "emp");
> This creates a corresponding HBase table named emp:
> hbase(main):003:0> list
> TABLE
> ATLAS_ENTITY_AUDIT_EVENTS
> atlas_janus
> emp
> 3 row(s)
> Took 0.0127 seconds
> => ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_janus", "emp"]
> hbase(main):004:0> describe 'emp'
> Table emp is ENABLED
> emp
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', 
> NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', 
> CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', 
> TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', 
> BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', 
> IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', 
> PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', 
> BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
> 1 row(s)
> Took 0.1961 seconds
>  
> In this case the Hive hook should provide lineage information linking the 
> Hive table to its backing HBase table (Hive table -> HBase table).
>  
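
To make the requested lineage concrete, here is a minimal Python sketch of the information the hook would need to derive from the DDL in the description. It is an illustration only, not the Atlas Hive hook's code or entity model: the function names and the shape of the returned dictionaries are hypothetical. It assumes that "hbase.columns.mapping" is positional (one entry per Hive column, with ":key" denoting the row key) and that the HBase table name falls back to the Hive table name when "hbase.table.name" is not set.

```python
def map_hive_columns_to_hbase(hive_columns, columns_mapping):
    """Pair each Hive column with its HBase target, by position.

    Returns {hive_column: (column_family, qualifier)}; the row key is
    represented as (None, ":key"). Hypothetical helper for illustration.
    """
    targets = [t.strip() for t in columns_mapping.split(",")]
    if len(targets) != len(hive_columns):
        raise ValueError("mapping must have one entry per Hive column")

    mapping = {}
    for column, target in zip(hive_columns, targets):
        if target == ":key":
            mapping[column] = (None, ":key")
        else:
            family, _, qualifier = target.partition(":")
            mapping[column] = (family, qualifier)
    return mapping


def build_lineage(hive_table, tbl_properties):
    """Describe the Hive table -> HBase table edge as a plain dict.

    The dict layout is an assumption made for this sketch, not an
    Atlas entity definition.
    """
    # Assumption: without "hbase.table.name" both tables share one name.
    hbase_table = tbl_properties.get("hbase.table.name", hive_table)
    return {
        "input": {"system": "hive", "table": hive_table},
        "output": {"system": "hbase", "table": hbase_table},
    }


columns = map_hive_columns_to_hbase(
    ["id", "name", "role"], ":key,cf1:name,cf1:role"
)
lineage = build_lineage("hbase_table_emp", {"hbase.table.name": "emp"})
```

For the example table, this pairs id with the row key, name and role with column family cf1, and yields an edge from the Hive table hbase_table_emp to the HBase table emp.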



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
