-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48939/#review138860
-----------------------------------------------------------


Ship it!





- Hemanth Yamijala


On June 20, 2016, 6:22 p.m., Suma Shivaprasad wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48939/
> -----------------------------------------------------------
> 
> (Updated June 20, 2016, 6:22 p.m.)
> 
> 
> Review request for atlas, Shwetha GS and Hemanth Yamijala.
> 
> 
> Bugs: ATLAS-904
>     https://issues.apache.org/jira/browse/ATLAS-904
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> 1. Process qualified name = HiveOperation.name + sorted inputs + sorted 
> outputs
> 2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, 
> INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding 
> WriteEntity.WriteType as well, which exhibits the following behaviour:
> a. If there are multiple outputs, adds the query type (WriteType) for each 
> output.
> b. If the query being run is of type INSERT [into/overwrite] TABLE 
> [PARTITION], WriteType is INSERT/INSERT_OVERWRITE.
> c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as 
> PATH_WRITE.
> d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note: 
> lineage is not available for this since it is a single-table operation].
> 3. When an input is of type local dir or hdfs path, it is currently not added 
> to the qualified name. The reason is that partition-based paths would cause 
> a lot of processes to be created instead of updating the same process.
> Pending:
> Address Shwetha G S's suggestion to add hdfs paths to the process qualified 
> name only in case of non-partition-based queries. This needs to be done per 
> HiveOperation type:
> 1. If HiveOperation = LOAD, IMPORT, EXPORT, detect if the current query 
> context is dealing with partitions and do not add the path if it is 
> partition based.
> 2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, detect if the 
> query context is dealing with a partitioned table in inputs and decide 
> whether the path needs to be added or not.
> 
> 
> Diffs
> -----
> 
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32 
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 5d9950f 
>   addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java 5a175e7 
>   webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30 
> 
> Diff: https://reviews.apache.org/r/48939/diff/
> 
> 
> Testing
> -------
> 
> Existing tests modified to query with the new qualified name. Tests for 
> INSERT INTO TABLE still need to be added.
> 
> 
> Thanks,
> 
> Suma Shivaprasad
> 
>
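
For readers following the quoted description, items 1 to 3 can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the HiveHook patch: the class ProcessQualifiedNameSketch, the Input helper, the method qualifiedName, and the ":" and "->" separators are all hypothetical, and write types are passed as plain strings rather than as Hive's WriteEntity.WriteType.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: class, method, and separator choices are hypothetical
// and are not taken from the HiveHook patch under review.
final class ProcessQualifiedNameSketch {
    private static final String SEP = ":";      // assumed field separator
    private static final String IO_SEP = "->";  // assumed inputs/outputs divider

    /** Minimal stand-in for a Hive read entity. */
    static final class Input {
        final String name;
        final boolean isPath; // true for a local dir or HDFS path

        Input(String name, boolean isPath) {
            this.name = name;
            this.isPath = isPath;
        }
    }

    /**
     * Builds operation name + sorted inputs + sorted outputs, where every
     * output is followed by its write type (for example INSERT or PATH_WRITE).
     */
    static String qualifiedName(String operationName,
                                Collection<Input> inputs,
                                Map<String, String> outputWriteTypes) {
        // Item 3: path inputs are skipped; TreeSet keeps the names sorted.
        TreeSet<String> sortedInputs = new TreeSet<String>();
        for (Input in : inputs) {
            if (!in.isPath) {
                sortedInputs.add(in.name);
            }
        }

        StringBuilder sb = new StringBuilder(operationName);
        for (String name : sortedInputs) {
            sb.append(SEP).append(name);
        }
        sb.append(IO_SEP);

        // Item 2a: each output carries its own write type; TreeMap sorts by
        // output name so the result does not depend on iteration order.
        for (Map.Entry<String, String> out
                : new TreeMap<String, String>(outputWriteTypes).entrySet()) {
            sb.append(SEP).append(out.getKey()).append(SEP).append(out.getValue());
        }
        return sb.toString();
    }
}
```

With inputs default.t2, /tmp/staging (a path) and default.t1, and a single output default.out written with INSERT_OVERWRITE, the sketch yields QUERY:default.t1:default.t2->:default.out:INSERT_OVERWRITE for operation name QUERY. The same name comes out regardless of the order in which the inputs are supplied, which is the property that lets repeated runs update one process instead of creating new ones.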
