> On June 20, 2016, 9:05 a.m., Hemanth Yamijala wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java,
> > line 584
> > <https://reviews.apache.org/r/48939/diff/2/?file=1423788#file1423788line584>
> >
> > Is it safe to rely on the equals / hashcode of Entity to serve as key?
> > If you've analyzed this and feel it is fine, please do close the issue.
Yes, it uses name for equals/hashCode, and name is a qualified name:
private String computeName() {
  switch (typ) {
  case DATABASE:
    return "database:" + database.getName();
  case TABLE:
    return t.getDbName() + "@" + t.getTableName();
  case PARTITION:
    return t.getDbName() + "@" + t.getTableName() + "@" + p.getName();
  case DUMMYPARTITION:
    return p.getName();
  case FUNCTION:
    if (database != null) {
      return database.getName() + "." + stringObject;
    }
    return stringObject;
  default:
    return d.toString();
  }
}
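
To illustrate why a name-based equals/hashCode is enough for use as a map key, here is a minimal sketch. NamedEntity and distinctKeys are hypothetical stand-ins written for this example, not the Hive Entity class; the only thing assumed from the above is that identity is derived from the "db@table" style name.

```java
import java.util.HashMap;
import java.util.Map;

final class EntityKeySketch {

    /** Hypothetical stand-in for an entity whose identity is its computed name. */
    static final class NamedEntity {
        private final String name;

        NamedEntity(String dbName, String tableName) {
            // Same "db@table" shape as the TABLE case of computeName().
            this.name = dbName + "@" + tableName;
        }

        String getName() {
            return name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof NamedEntity)) {
                return false;
            }
            return name.equals(((NamedEntity) o).name);
        }

        @Override
        public int hashCode() {
            // Consistent with equals: both depend only on name.
            return name.hashCode();
        }
    }

    /** Counts the distinct map keys produced by a list of {db, table} pairs. */
    static int distinctKeys(String[][] tables) {
        Map<NamedEntity, String> byEntity = new HashMap<>();
        for (String[] t : tables) {
            // Two separately constructed entities with the same name share one slot.
            byEntity.put(new NamedEntity(t[0], t[1]), t[0] + "." + t[1]);
        }
        return byEntity.size();
    }
}
```

Two entities built separately for the same table collapse to one key, while a different table gets its own key, which is the property the map in the hook relies on.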
- Suma
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48939/#review138554
-----------------------------------------------------------
On June 20, 2016, 4 a.m., Suma Shivaprasad wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48939/
> -----------------------------------------------------------
>
> (Updated June 20, 2016, 4 a.m.)
>
>
> Review request for atlas, Shwetha GS and Hemanth Yamijala.
>
>
> Bugs: ATLAS-904
> https://issues.apache.org/jira/browse/ATLAS-904
>
>
> Repository: atlas
>
>
> Description
> -------
>
> 1. Process qualified name = HiveOperation.name + sorted inputs + sorted
> outputs
> 2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT,
> INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding
> WriteEntity.WriteType as well, which exhibits the following behaviour:
> a. If there are multiple outputs, adds the query type (WriteType) for each
> output
> b. If the query being run is of type INSERT [into/overwrite] TABLE
> [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
> c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as
> PATH_WRITE
> d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note -
> lineage is not available for this since it is a single-table operation]
> 3. When the input is a local dir or hdfs path, it is currently not added to
> the qualified name. The reason is that partition-based paths cause a lot of
> processes to be created in this case instead of updating the same process.
> Pending:
> Address Shwetha G S's suggestion to add hdfs paths to the process qualified
> name only in case of non-partition-based queries. This needs to be done per
> HiveOperation type:
> 1. If HiveOperation = LOAD, IMPORT, EXPORT - detect if the current query
> context is dealing with partitions and do not add the path if it is
> partition based.
> 2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, then detect if
> the query context is dealing with a partitioned table in inputs and decide if
> we need to add the path or not.
>
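
The naming scheme in the description above can be sketched roughly as follows. This is an illustration under assumptions, not the code in HiveHook.java: the class, the method processQualifiedName, the ":" and "->" separators, and the local WriteType enum (a subset mirroring the values named in the description) are all made up for the example.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class ProcessQualifiedNameSketch {

    /** Hypothetical local enum; mirrors only the write types named in the description. */
    enum WriteType { INSERT, INSERT_OVERWRITE, UPDATE, DELETE, PATH_WRITE }

    /**
     * Builds operation name + sorted inputs + sorted outputs, tagging each
     * output with its write type. Separators are assumptions for illustration.
     */
    static String processQualifiedName(String operationName,
                                       Collection<String> inputs,
                                       Map<String, WriteType> outputs) {
        StringBuilder name = new StringBuilder(operationName);

        // TreeSet sorts (and de-duplicates) inputs so ordering in the query
        // context does not change the resulting name.
        for (String input : new TreeSet<>(inputs)) {
            name.append(':').append(input);
        }

        name.append("->");

        // TreeMap sorts outputs by their qualified name; each output carries
        // its own write type, so INSERT and INSERT_OVERWRITE yield different names.
        for (Map.Entry<String, WriteType> out : new TreeMap<>(outputs).entrySet()) {
            name.append(':').append(out.getValue()).append(':').append(out.getKey());
        }
        return name.toString();
    }
}
```

With this shape, the same query run with inputs listed in a different order maps to the same process, while changing only the write type of an output produces a distinct one.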
>
> Diffs
> -----
>
>
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
> c956a32
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java
> 23c82df
> addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java
> e7fbf71
> webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java
> 0713d30
>
> Diff: https://reviews.apache.org/r/48939/diff/
>
>
> Testing
> -------
>
> Existing tests modified to query with the new qualified name. Tests for
> INSERT INTO TABLE still need to be added.
>
>
> Thanks,
>
> Suma Shivaprasad
>
>