-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48939/#review138860
-----------------------------------------------------------

Ship it!


Ship It!

- Hemanth Yamijala


On June 20, 2016, 6:22 p.m., Suma Shivaprasad wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48939/
> -----------------------------------------------------------
>
> (Updated June 20, 2016, 6:22 p.m.)
>
>
> Review request for atlas, Shwetha GS and Hemanth Yamijala.
>
>
> Bugs: ATLAS-904
>     https://issues.apache.org/jira/browse/ATLAS-904
>
>
> Repository: atlas
>
>
> Description
> -------
>
> 1. Process qualified name = HiveOperation.name + sorted inputs + sorted
>    outputs.
> 2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT,
>    INSERT_OVERWRITE, UPDATE, DELETE etc. from one another. Hence
>    WriteEntity.WriteType is added as well, with the following behaviour:
>    a. If there are multiple outputs, the query type (WriteType) is added
>       for each output.
>    b. If the query being run is of type INSERT [into/overwrite] TABLE
>       [PARTITION], WriteType is INSERT/INSERT_OVERWRITE.
>    c. If the query is of type INSERT OVERWRITE hdfs_path, WriteType is
>       added as PATH_WRITE.
>    d. If the query is of type UPDATE/DELETE, the type is added as
>       UPDATE/DELETE. [Note: lineage is not available for this, since it is
>       a single-table operation.]
> 3. When the input is a local dir or an hdfs path, it is currently not added
>    to the qualified name. The reason is that partition-based paths would
>    cause a lot of processes to be created in this case instead of updating
>    the same process.
>
> Pending:
> Address Shwetha G S's suggestion to add hdfs paths to the process qualified
> name only in the case of non-partition-based queries. This needs to be done
> per HiveOperation type:
> 1. If HiveOperation = LOAD, IMPORT, EXPORT: detect whether the current query
>    context is dealing with partitions, and do not add the path if it is
>    partition based.
> 2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH: detect whether
>    the query context is dealing with a partitioned table in its inputs, and
>    decide whether the path needs to be added or not.
>
>
> Diffs
> -----
>
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 5d9950f
>   addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java 5a175e7
>   webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30
>
> Diff: https://reviews.apache.org/r/48939/diff/
>
>
> Testing
> -------
>
> Existing tests modified to query with the new qualified name. Tests for
> INSERT INTO TABLE still need to be added.
>
>
> Thanks,
>
> Suma Shivaprasad
>
>
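
For readers skimming the description, the qualified-name rule in items 1 and 2
(operation name + sorted inputs + sorted outputs, each output tagged with its
write type) could be sketched roughly as below. This is only an illustration
under stated assumptions, not the code in the patch: the class name, the
build() method, the ":" and "->" separators, and the use of plain strings in
place of HiveOperation / WriteEntity.WriteType are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and separators are assumptions, not taken from the patch.
public class ProcessQualifiedNameSketch {
    static final String SEP = ":";     // assumed separator between name parts
    static final String IO_SEP = "->"; // assumed separator between inputs and outputs

    // operationName: stands in for HiveOperation.name
    // inputs: qualified names of the input entities (any order)
    // outputsWithWriteType: output qualified name -> write type (e.g. "INSERT", "PATH_WRITE")
    static String build(String operationName,
                        List<String> inputs,
                        Map<String, String> outputsWithWriteType) {
        StringBuilder sb = new StringBuilder(operationName);

        // Sort a copy of the inputs so the same query always yields the same name.
        List<String> sortedInputs = new ArrayList<>(inputs);
        Collections.sort(sortedInputs);
        for (String in : sortedInputs) {
            sb.append(SEP).append(in);
        }

        sb.append(IO_SEP);

        // A TreeMap iterates its keys in sorted order; each output carries its write type.
        for (Map.Entry<String, String> e : new TreeMap<>(outputsWithWriteType).entrySet()) {
            sb.append(SEP).append(e.getKey()).append(SEP).append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = build("QUERY",
                Arrays.asList("db.t2", "db.t1"),
                Collections.singletonMap("db.out", "INSERT_OVERWRITE"));
        System.out.println(name);
    }
}
```

Because inputs and outputs are sorted before being appended, re-running the
same query with its entities listed in a different order would map to the same
process rather than creating a new one, which appears to be the intent of
item 1.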