Jiaqi Shan created ATLAS-3421:
---------------------------------

             Summary: HiveBridge should skip tables that have not been updated
                 Key: ATLAS-3421
                 URL: https://issues.apache.org/jira/browse/ATLAS-3421
             Project: Atlas
          Issue Type: Improvement
          Components:  atlas-core
            Reporter: Jiaqi Shan


h1. *Background*

At my company, there are more than 80,000 Hive tables in production. We 
import Hive metadata into Atlas every day so that no change missed by the 
Hive Hook goes unrecorded. This scheduled task causes two problems:

1) It takes a long time (more than 20 hours per run).

2) Incremental export cannot be used, because HiveBridge updates the 
updateTime of every hive_table on each import.

h1. *Solution*

When a Hive table is created or altered, its transient_lastDdlTime is 
updated. So if transient_lastDdlTime is the same in Atlas and in the 
metastore, HiveBridge should skip updating the table's metadata.
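
The check described above could look roughly like the following. This is only a minimal sketch, not the actual HiveBridge code: the class LastDdlTimeCheck and the method shouldImport are hypothetical names, and it assumes that both the metastore table and the hive_table entity in Atlas expose their table parameters as a string map containing transient_lastDdlTime. When either side lacks the value, the sketch falls back to importing the table, since it cannot prove that nothing changed.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of Atlas or HiveBridge. */
final class LastDdlTimeCheck {
    /** Table parameter that Hive updates when a table is created or altered. */
    static final String LAST_DDL_TIME = "transient_lastDdlTime";

    private LastDdlTimeCheck() {
    }

    /**
     * Decides whether a table needs to be (re)imported.
     *
     * @param metastoreParams table parameters read from the Hive metastore
     * @param atlasParams     parameters stored on the Atlas entity, or null
     *                        if the table is not registered in Atlas yet
     * @return true if the table should be imported, false if it can be skipped
     */
    static boolean shouldImport(Map<String, String> metastoreParams,
                                Map<String, String> atlasParams) {
        if (atlasParams == null) {
            return true; // table is unknown to Atlas, so it must be imported
        }
        String hiveTime = (metastoreParams == null)
                ? null
                : metastoreParams.get(LAST_DDL_TIME);
        String atlasTime = atlasParams.get(LAST_DDL_TIME);
        if (hiveTime == null || atlasTime == null) {
            return true; // value missing on one side: import to be safe
        }
        // Skip only when both sides report the same last DDL time.
        return !hiveTime.equals(atlasTime);
    }
}
```

In an import loop, a call such as shouldImport(metastoreParams, atlasParams) would be made once per table before building the entity, so unchanged tables cost a single lookup and their updateTime in Atlas stays untouched.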

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
