[ 
https://issues.apache.org/jira/browse/ATLAS-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Shan updated ATLAS-3421:
------------------------------
    Attachment: ATLAS-3421.patch

> HiveBridge supports to skip tables without any update
> -----------------------------------------------------
>
>                 Key: ATLAS-3421
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3421
>             Project: Atlas
>          Issue Type: Improvement
>          Components:  atlas-core
>            Reporter: Jiaqi Shan
>            Priority: Major
>         Attachments: ATLAS-3421.patch
>
>
> h1. *Background*
> At my company, there are more than 80000 Hive tables in production. We 
> import Hive metadata into Atlas every day so that no change missed by the 
> Hive Hook goes unrecorded. This scheduled task causes two problems:
> 1) It takes a long time (more than 20 hours per run).
> 2) Incremental export cannot be used, because HiveBridge updates the 
> updateTime of every hive_table on each import.
> h1. *Solution*
> When a Hive table is created or updated, its transient_lastDdlTime is 
> updated. So if transient_lastDdlTime is the same in Atlas and in the 
> metastore, HiveBridge should skip updating that table's metadata.
>  
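The skip check described under Solution could look roughly like the sketch below. It is not taken from ATLAS-3421.patch. The class name DdlTimeSkipCheck and the method canSkip are hypothetical, and it assumes the table parameters from both the metastore and Atlas are available as plain string maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Atlas or the attached patch.
final class DdlTimeSkipCheck {
    // Table parameter named in the issue description.
    static final String LAST_DDL_TIME = "transient_lastDdlTime";

    /**
     * Returns true only when both parameter maps carry the same non-null
     * transient_lastDdlTime, meaning the table looks unchanged since the
     * previous import. A missing value on either side never allows a skip.
     */
    static boolean canSkip(Map<String, String> metastoreParams,
                           Map<String, String> atlasParams) {
        if (metastoreParams == null || atlasParams == null) {
            return false;
        }
        String inMetastore = metastoreParams.get(LAST_DDL_TIME);
        String inAtlas = atlasParams.get(LAST_DDL_TIME);
        return inMetastore != null && inMetastore.equals(inAtlas);
    }

    public static void main(String[] args) {
        Map<String, String> metastore = new HashMap<>();
        metastore.put(LAST_DDL_TIME, "1570000000");

        Map<String, String> atlas = new HashMap<>();
        atlas.put(LAST_DDL_TIME, "1570000000");
        System.out.println("unchanged table, skip: " + canSkip(metastore, atlas));

        // A DDL change in the metastore moves the timestamp forward.
        metastore.put(LAST_DDL_TIME, "1570003600");
        System.out.println("changed table, skip: " + canSkip(metastore, atlas));
    }
}
```

Treating a missing value as "do not skip" means a table that Atlas has never seen, or one without the parameter, is still imported in full.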



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
