[
https://issues.apache.org/jira/browse/ATLAS-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jiaqi Shan reassigned ATLAS-3421:
---------------------------------
Assignee: Jiaqi Shan
> HiveBridge supports to skip tables without any update
> -----------------------------------------------------
>
> Key: ATLAS-3421
> URL: https://issues.apache.org/jira/browse/ATLAS-3421
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-core
> Reporter: Jiaqi Shan
> Assignee: Jiaqi Shan
> Priority: Major
> Attachments: ATLAS-3421.patch
>
>
> h1. *Background*
> In my company, there are more than 80,000 Hive tables in production. We
> import Hive metadata into Atlas every day so that no change missed by the
> Hive hook goes unrecorded. This scheduled task causes two problems:
> 1) It takes a long time (more than 20 hours per run).
> 2) Incremental export can't be used, because HiveBridge updates every
> hive_table's updateTime on each import.
> h1. *Solution*
> Creating or updating a Hive table updates its transient_lastDdlTime
> parameter. So if transient_lastDdlTime is the same in Atlas and in the
> metastore, HiveBridge should skip updating that table's metadata.
>
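The check proposed under Solution could be sketched roughly as below. This is only an illustration and not the contents of ATLAS-3421.patch: the class name DdlTimeSkipCheck and the method shouldSkip are hypothetical, and the sketch assumes both the metastore table and the Atlas hive_table entity expose their parameters as a map of strings. When either side has no transient_lastDdlTime, it falls back to a normal import rather than skipping.

```java
import java.util.Map;

// Hypothetical helper, not part of Atlas: decides whether an import run
// can skip a table because its DDL timestamp has not changed.
final class DdlTimeSkipCheck {
    // Table parameter that Hive maintains on create/alter.
    static final String LAST_DDL_TIME = "transient_lastDdlTime";

    private DdlTimeSkipCheck() {
    }

    /**
     * @param metastoreParams table parameters read from the Hive metastore
     * @param atlasParams     parameters stored on the Atlas entity (assumed shape)
     * @return true only when both sides carry the same transient_lastDdlTime
     */
    static boolean shouldSkip(Map<String, String> metastoreParams,
                              Map<String, String> atlasParams) {
        if (metastoreParams == null || atlasParams == null) {
            return false; // nothing to compare, so import as usual
        }
        String hiveValue = metastoreParams.get(LAST_DDL_TIME);
        String atlasValue = atlasParams.get(LAST_DDL_TIME);
        // A missing value on either side is treated as "changed".
        return hiveValue != null && hiveValue.equals(atlasValue);
    }
}
```

With such a check, the import loop would call shouldSkip once per table and update the Atlas entity only when it returns false, which would also leave updateTime untouched for unchanged tables.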
--
This message was sent by Atlassian Jira
(v8.3.4#803005)