[ 
https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924301#comment-15924301
 ] 

Sharmadha Sainath commented on ATLAS-1661:
------------------------------------------

[~ayubkhan] 
>> I believe the intent of import-hive.sh tool is not to track all the metadata 
>> changes but to take a snapshot of the metadata at that point of time.
I completely agree with you; that is the intent. Currently the import-hive 
script looks up the qualified name of each table and, if an entity with that 
name is already present, updates it. In the case mentioned in the description, 
it doesn't find the renamed table (table1_new), so it creates a new table. But 
it would be a good-to-have feature if the import-hive script had a mechanism to 
know the history of the renamed table and update the existing entity accordingly.
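
To illustrate the lookup described above, here is a minimal sketch. It uses a 
hypothetical in-memory dict in place of Atlas, and the names `qualified_name`, 
`import_snapshot`, the `primary` cluster name and the `db.table@cluster` key 
layout are assumptions made for the example, not the actual import-hive code.

```python
# Hypothetical model of a snapshot import keyed by qualified name.
# Nothing here calls Atlas; the store is a plain dict.

def qualified_name(db, table, cluster):
    # Assumed key layout for illustration: db.table@cluster
    return f"{db}.{table}@{cluster}"


def import_snapshot(store, hive_tables, cluster="primary"):
    """Update entities whose qualified name exists, create the rest."""
    created, updated = [], []
    for db, table in hive_tables:
        qn = qualified_name(db, table, cluster)
        if qn in store:
            # Same qualified name seen before: update in place.
            store[qn]["imports"] += 1
            updated.append(qn)
        else:
            # Unknown qualified name: a brand-new entity is created.
            store[qn] = {"name": table, "imports": 1}
            created.append(qn)
    return created, updated


store = {}
# First run: Hive contains table1.
import_snapshot(store, [("default", "table1")])
# Table is renamed in Hive with the hook disabled, then a second run.
created, updated = import_snapshot(store, [("default", "table1_new")])
```

After the second run the store holds both `table1` and `table1_new`, because 
a lookup by qualified name alone has no way to tell that the second is a 
rename of the first.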

>> Are you suggesting to have the hiveHook capability built into import-hive.sh 
>> tool also?
Yes, because that would be the expectation from the customer. The only 
difference the customer would see is that the hive hook updates as and when a 
query is fired, while the import-hive script does a batch update when it is run.

> import hive script to handle updates like rename/delete
> -------------------------------------------------------
>
>                 Key: ATLAS-1661
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1661
>             Project: Atlas
>          Issue Type: Improvement
>          Components: atlas-intg
>            Reporter: Sharmadha Sainath
>            Priority: Minor
>
> 1. Disabled hive hook
> 2. Created table table1
> 3. Ran import-hive.sh script; Atlas ingested table1.
> 4. Altered table table1, renaming it to table1_new.
> 5. Ran import-hive.sh script; Atlas created a new table table1_new.
> table1 wasn't updated with the new name.
> This is the expected behavior with the import-hive script as opposed to the 
> hive hook, as the hive hook is synchronous and import-hive is not.
> But for a customer, running import-hive.sh multiple times with many hive 
> operations in between may result in inconsistency when applying ranger 
> policies to the table, and in many other scenarios, since it is not 
> documented that the import-hive script should be run only once.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
