-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48939/
-----------------------------------------------------------

(Updated June 20, 2016, 5:27 p.m.)


Review request for atlas, Shwetha GS and Hemanth Yamijala.


Changes
-------

Thanks for reviewing, Hemanth. I have addressed the review comments. Please 
reopen any issue I have dropped if you feel it should be addressed, or if you 
have any more questions.


Bugs: ATLAS-904
    https://issues.apache.org/jira/browse/ATLAS-904


Repository: atlas


Description
-------

1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name does not provide identifiers for distinguishing INSERT, 
INSERT_OVERWRITE, UPDATE, DELETE etc. Hence WriteEntity.WriteType is added as 
well, with the following behaviour:
a. If there are multiple outputs, the query type (WriteType) is added for each 
output.
b. If the query is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType 
is INSERT/INSERT_OVERWRITE.
c. If the query is of type INSERT OVERWRITE hdfs_path, WriteType is PATH_WRITE.
d. If the query is of type UPDATE/DELETE, WriteType is UPDATE/DELETE. [Note: 
lineage is not available for this, since it is a single-table operation.]
3. When an input is a local dir or an hdfs path, it is currently not added to 
the qualified name. The reason is that partition-based paths would cause a lot 
of processes to be created instead of updating the same process.
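For illustration, the naming scheme in items 1 and 2 could be sketched roughly 
as below. This is not the code in the diff: the class name, the method name, 
the ":" and "->" separators and the use of plain strings for write types are 
all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the scheme described above; NOT the actual HiveHook code.
final class ProcessQualifiedNameSketch {

    // Assumed separators; the real patch may use different ones.
    private static final String SEP = ":";
    private static final String IO_SEP = "->";

    private ProcessQualifiedNameSketch() { }

    /**
     * @param operationName the HiveOperation name, e.g. "QUERY"
     * @param inputs        qualified names of input tables (paths are left out, see item 3)
     * @param outputs       output qualified name mapped to its write type,
     *                      e.g. "INSERT", "INSERT_OVERWRITE", "PATH_WRITE"
     */
    static String build(String operationName, List<String> inputs, Map<String, String> outputs) {
        StringBuilder sb = new StringBuilder(operationName);

        // Sort a copy of the inputs so the same query always yields the same name.
        List<String> sortedInputs = new ArrayList<>(inputs);
        Collections.sort(sortedInputs);
        for (String in : sortedInputs) {
            sb.append(SEP).append(in);
        }

        sb.append(IO_SEP);

        // TreeMap iterates in key order, which gives sorted outputs;
        // the write type is appended for each output (item 2a).
        for (Map.Entry<String, String> out : new TreeMap<>(outputs).entrySet()) {
            sb.append(SEP).append(out.getKey()).append(SEP).append(out.getValue());
        }
        return sb.toString();
    }
}
```

With this sketch, the same inputs given in a different order produce the same 
name, while INSERT and INSERT_OVERWRITE into the same table produce different 
names.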
Pending:
Address Shwetha G S's suggestion to add hdfs paths to the process qualified 
name only for non-partition-based queries. This needs to be done per 
HiveOperation type:
1. If HiveOperation = LOAD, IMPORT, EXPORT: detect whether the current query 
context is dealing with partitions, and do not add the path if it is partition 
based.
2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH: detect whether the 
query context is dealing with a partitioned table in its inputs, and decide 
whether the path needs to be added.
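
One possible shape for the pending per-operation decision is sketched below. 
It is only a sketch: the OpKind enum, the method name and both boolean 
parameters are hypothetical, and the rule for INSERT OVERWRITE to a path 
(skip the path when an input table is partitioned) is an assumption, since 
the description leaves that decision open.

```java
// Hypothetical sketch of the pending rule; names and enum values are illustrative only.
final class PathInclusionSketch {

    // Simplified operation kinds; the real HiveOperation enum has many more values.
    enum OpKind { LOAD, IMPORT, EXPORT, INSERT_OVERWRITE_PATH, OTHER }

    private PathInclusionSketch() { }

    /**
     * Decides whether hdfs/local paths should be part of the process qualified name.
     *
     * @param op                       simplified kind of the Hive operation
     * @param queryUsesPartitions      true if the query context deals with partitions
     * @param hasPartitionedInputTable true if any input table is partitioned
     */
    static boolean includePaths(OpKind op, boolean queryUsesPartitions,
                                boolean hasPartitionedInputTable) {
        switch (op) {
            case LOAD:
            case IMPORT:
            case EXPORT:
                // Pending item 1: do not add the path for partition-based queries.
                return !queryUsesPartitions;
            case INSERT_OVERWRITE_PATH:
                // Pending item 2. ASSUMPTION: skip the path when an input table is
                // partitioned; the description does not fix this decision.
                return !hasPartitionedInputTable;
            default:
                // Assumed default: keep paths out of the qualified name.
                return false;
        }
    }
}
```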


Diffs (updated)
-----

  
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32 
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 5d9950f 
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java 5a175e7 
  webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30 

Diff: https://reviews.apache.org/r/48939/diff/


Testing
-------

Existing tests have been modified to query with the new qualified name. Tests 
for INSERT INTO TABLE still need to be added.


Thanks,

Suma Shivaprasad
