Jiaqi Shan created ATLAS-3421:
---------------------------------
Summary: HiveBridge should support skipping tables that have no updates
Key: ATLAS-3421
URL: https://issues.apache.org/jira/browse/ATLAS-3421
Project: Atlas
Issue Type: Improvement
Components: atlas-core
Reporter: Jiaqi Shan
h1. *Background*
In my company, there are more than 80,000 Hive tables in production. We
import Hive metadata into Atlas every day so that changes missed by the Hive
Hook are still captured. This scheduled task causes two problems:
1) It takes a long time (more than 20 hours per run).
2) Incremental export can't be used, because HiveBridge updates every
hive_table's updateTime on each import.
h1. *Solution*
When a Hive table is created or updated, its transient_lastDdlTime is
updated. So if transient_lastDdlTime is the same in Atlas and in the
metastore, HiveBridge should skip updating that table's metadata.
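
The check proposed above could be sketched roughly as follows. This is only an illustration, not HiveBridge code: the class LastDdlTimeCheck and the method shouldSkip are hypothetical names, and it assumes both the metastore table parameters and the parameters stored on the Atlas hive_table entity are available as string maps that carry transient_lastDdlTime.

```java
import java.util.Map;

// Hypothetical helper, not part of Atlas or HiveBridge.
public class LastDdlTimeCheck {
    // Table parameter that Hive updates on DDL changes.
    static final String LAST_DDL_TIME = "transient_lastDdlTime";

    /**
     * Returns true only when both sides carry the same transient_lastDdlTime.
     * If the value is missing on either side, the table cannot be shown to be
     * unchanged, so it is not skipped.
     */
    static boolean shouldSkip(Map<String, String> atlasParams,
                              Map<String, String> metastoreParams) {
        String inAtlas = atlasParams == null ? null : atlasParams.get(LAST_DDL_TIME);
        String inMetastore = metastoreParams == null ? null : metastoreParams.get(LAST_DDL_TIME);
        return inAtlas != null && inAtlas.equals(inMetastore);
    }
}
```

A table that does not yet exist in Atlas has no stored parameters, so shouldSkip returns false and the table is imported as usual. Skipped tables are never rewritten, so their updateTime would stay untouched, which is what incremental export needs.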