[ 
https://issues.apache.org/jira/browse/ATLAS-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated ATLAS-415:
-----------------------------------
    Attachment: ATLAS-415.patch

Attaching a patch for quick review.

The main fix is to call {{AtlasClient.updateEntity}} when we find that a 
table is already registered with Atlas. The rest of the changes support a 
unit test I wrote, {{HiveMetaStoreBridgeTest}}, plus some refactoring.

With this patch, hive-import works for the case I described in the bug and 
updates the created table properly.
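
The create-or-update flow described above can be sketched roughly as follows. 
This is a minimal illustration, not the contents of the patch: 
{{MetadataClient}}, {{InMemoryClient}}, {{registerTable}} and the attribute 
keys are hypothetical names made up for the example, and the interface does 
not reproduce the real {{AtlasClient}} signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; none of these types come from the Atlas code base.
final class RegisterOrUpdateSketch {

    /** Hypothetical stand-in for a metadata client (not the real AtlasClient API). */
    interface MetadataClient {
        String findTableId(String qualifiedName);           // null when not registered
        String createEntity(Map<String, Object> entity);    // returns the new id
        void updateEntity(String id, Map<String, Object> entity);
    }

    /** Creates the table on first import and updates it in place on later imports. */
    static String registerTable(MetadataClient client, Map<String, Object> table) {
        String qualifiedName = (String) table.get("qualifiedName");
        String id = client.findTableId(qualifiedName);
        if (id == null) {
            return client.createEntity(table);
        }
        // Already registered (for example by a hook that set only the required
        // attributes): send the current definition to the server as an update.
        client.updateEntity(id, table);
        return id;
    }

    /** In-memory fake used only to exercise the flow. */
    static final class InMemoryClient implements MetadataClient {
        private final Map<String, String> idsByName = new HashMap<>();
        private final Map<String, Map<String, Object>> entitiesById = new HashMap<>();

        @Override
        public String findTableId(String qualifiedName) {
            return idsByName.get(qualifiedName);
        }

        @Override
        public String createEntity(Map<String, Object> entity) {
            String id = "id-" + (entitiesById.size() + 1);
            idsByName.put((String) entity.get("qualifiedName"), id);
            entitiesById.put(id, new HashMap<>(entity));
            return id;
        }

        @Override
        public void updateEntity(String id, Map<String, Object> entity) {
            entitiesById.put(id, new HashMap<>(entity));
        }

        Map<String, Object> get(String id) { return entitiesById.get(id); }

        int size() { return entitiesById.size(); }
    }
}
```

In this sketch a second import of the same table keeps the existing id and 
only replaces the stored attributes, so a table first registered without a 
storage descriptor picks one up on the next import.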

A couple of points that I want to call out for discussion in review:

* This might add extra calls to the server even when there is no change at 
all to the entity. I guess this will have a performance impact, but I am 
unsure how we can detect on the client whether anything has changed.
* Currently, I am doing the update only for tables. Is this needed for DBs and 
partitions as well? (I guess yes.)

> Hive import fails when importing a table that is already imported without 
> StorageDescriptor information
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-415
>                 URL: https://issues.apache.org/jira/browse/ATLAS-415
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: ATLAS-415.patch
>
>
> I found this when testing patches that integrate Storm with Atlas, but I 
> guess this may occur in other scenarios as well.
> To reproduce:
> * Run a Storm topology with the Atlas hook enabled that has a HiveBolt 
> (requires patches for ATLAS-181 and friends).
> * Run hive-import following the above.
> The first step creates a Hive DB and table setting just the required 
> attributes. Note that the StorageDescriptor is an optional attribute as per 
> the Hive DataModel now. 
> The second step fails with this exception:
> {code}
> Exception in thread "main" java.lang.NullPointerException
>       at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.getSDForTable(HiveMetaStoreBridge.java:345)
>       at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importTables(HiveMetaStoreBridge.java:219)
>       at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:104)
>       at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:96)
>       at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:503)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
