Re: Review Request 52077: Column level lineage in Hive

2016-09-30 Thread Shwetha GS

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review150990
---


Ship it!

- Shwetha GS


On Sept. 29, 2016, 8:15 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 29, 2016, 8:15 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-29 Thread Suma Shivaprasad


> On Sept. 21, 2016, 8:58 p.m., Suma Shivaprasad wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java,
> >  line 97
> > 
> >
> > why is column qualifiedName different from the convention we are using 
> > for hive_column instances which are referred to from the table. Why is 
> > clustername removed?
> 
> Vimal Sharma wrote:
> Cluster information is not available in Lineage information provided by 
> Hive. Further, qualifiedName used in this patch is used only while setting 
> column lineage and is not used for communication with rest of Atlas codebase.
> 
> Suma Shivaprasad wrote:
> If we do not provide the same qualifiedName as in the current 
> HMSB.getColumnQualifiedName(), it will result in another entity being 
> created for the columns. Cluster information is available in 
> HMSB.getClusterName().
> 
> Vimal Sharma wrote:
> In the function populateColumnReferenceableMap, we are setting a mapping 
> from column string identifier(named as column qualified name) to its 
> corresponding column Referenceable object in Atlas. No new column 
> Referenceable entity is created. 
> 
> Further, in buildLineageMap, we are setting a mapping from destination 
> column qualified name to list of source column qualified names. Now, in the 
> key value pairs of the type (LineageInfo.DependencyKey, 
> LineageInfo.Dependency) in LineageInfo from Hive, there is no cluster 
> information available. So here we can't use the same pattern for column 
> qualified name as used in HMSB.getColumnQualifiedName.
> 
> If we set column string identifier as HMSB.getColumnQualifiedName in 
> function populateColumnReferenceableMap, we won't be able to access the 
> column referenceable objects from the map(created in 
> populateColumnReferenceableMap) in HiveHook when we are setting up column 
> lineage process in function createColumnLineageProcessInstances(lines 803 and 
> 812).

Sounds good


- Suma


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149896
---


On Sept. 29, 2016, 8:15 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 29, 2016, 8:15 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-29 Thread Vimal Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/
---

(Updated Sept. 29, 2016, 8:15 a.m.)


Review request for atlas.


Changes
---

Addressed Shwetha's review comments. I think it makes sense to handle the 
type update changes in ATLAS-1184, so I have marked ATLAS-1184 as required for 
this patch.


Bugs: ATLAS-1184 and ATLAS-247
https://issues.apache.org/jira/browse/ATLAS-1184
https://issues.apache.org/jira/browse/ATLAS-247


Repository: atlas


Description
---

After a CTAS (CREATE TABLE AS SELECT) query, the lineage relationship between 
the source columns and each destination column can be captured. This 
information can be used to create a column lineage process.


Diffs (updated)
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java a3464a0
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java 45f0bc9
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataTypes.java e094cb6
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java a5838b4

Diff: https://reviews.apache.org/r/52077/diff/


Testing
---


Thanks,

Vimal Sharma



Re: Review Request 52077: Column level lineage in Hive

2016-09-29 Thread Shwetha GS

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review150821
---


Fix it, then Ship it!





addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 634)


Change to warn, as we continue even without it.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 1049)


Remove toString.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 1051)


Change to warn.



addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java (line 1110)


Add a comment on why it's disabled and when the test can be enabled.



addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java (line 1163)


Add an assert that vertices contains a_guid and b_guid.



addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java (line 1169)


Add an assert that vertices contains sourceTableGUID.


- Shwetha GS


On Sept. 26, 2016, 1:06 p.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 26, 2016, 1:06 p.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-26 Thread Vimal Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/
---

(Updated Sept. 26, 2016, 1:06 p.m.)


Review request for atlas.


Changes
---

Addressed Shwetha's review comments. I think it makes sense to handle the 
type update changes in ATLAS-1184, so I have marked ATLAS-1184 as required for 
this patch.


Bugs: ATLAS-1184 and ATLAS-247
https://issues.apache.org/jira/browse/ATLAS-1184
https://issues.apache.org/jira/browse/ATLAS-247


Repository: atlas


Description
---

After a CTAS (CREATE TABLE AS SELECT) query, the lineage relationship between 
the source columns and each destination column can be captured. This 
information can be used to create a column lineage process.


Diffs (updated)
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java a3464a0
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java 45f0bc9
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataTypes.java e094cb6
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java a5838b4

Diff: https://reviews.apache.org/r/52077/diff/


Testing
---


Thanks,

Vimal Sharma



Re: Review Request 52077: Column level lineage in Hive

2016-09-26 Thread Shwetha GS

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review150383
---



1. Can you add a test with a lineage query on a column?
2. ReservedTypesRegistrar should change to updateTypes, so that upgrades work.
3. Once you make sure the tests work, disable them so that the tests don't 
break with Apache Hive 1.2.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java (line 334)


Rename to query/command, as this class type is also process.


- Shwetha GS


On Sept. 22, 2016, 11:53 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 22, 2016, 11:53 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-26 Thread Vimal Sharma


> On Sept. 21, 2016, 8:58 p.m., Suma Shivaprasad wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java,
> >  line 97
> > 
> >
> > why is column qualifiedName different from the convention we are using 
> > for hive_column instances which are referred to from the table. Why is 
> > clustername removed?
> 
> Vimal Sharma wrote:
> Cluster information is not available in Lineage information provided by 
> Hive. Further, qualifiedName used in this patch is used only while setting 
> column lineage and is not used for communication with rest of Atlas codebase.
> 
> Suma Shivaprasad wrote:
> If we do not provide the same qualifiedName as in the current 
> HMSB.getColumnQualifiedName(), it will result in another entity being 
> created for the columns. Cluster information is available in 
> HMSB.getClusterName().

In the function populateColumnReferenceableMap, we build a mapping from a 
column string identifier (named the column qualified name) to its corresponding 
column Referenceable object in Atlas. No new column Referenceable entity is 
created.

Further, in buildLineageMap, we build a mapping from each destination column 
qualified name to the list of its source column qualified names. The key-value 
pairs of type (LineageInfo.DependencyKey, LineageInfo.Dependency) in the 
LineageInfo from Hive carry no cluster information, so here we can't use the 
same pattern for the column qualified name as in HMSB.getColumnQualifiedName.

If we used HMSB.getColumnQualifiedName as the column string identifier in 
populateColumnReferenceableMap, we would not be able to look up the column 
Referenceable objects in the map (created in populateColumnReferenceableMap) 
from HiveHook when setting up the column lineage process in 
createColumnLineageProcessInstances (lines 803 and 812).
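
The lookup described above can be sketched roughly as follows. This is only 
an illustration, not the code in the patch: ColumnLineageSketch, columnKey, 
populateColumnMap and resolveSources are hypothetical names, plain Objects 
stand in for Atlas Referenceable instances, and the "db.table.column" key 
shape is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain Objects stand in for Atlas Referenceable instances. */
final class ColumnLineageSketch {

    /** Cluster-free identifier; the "db.table.column" shape is an assumption. */
    static String columnKey(String db, String table, String column) {
        return (db + "." + table + "." + column).toLowerCase();
    }

    /** Plays the role of populateColumnReferenceableMap: key -> existing column object. */
    static Map<String, Object> populateColumnMap(String db, String table,
                                                 Map<String, Object> columnsByName) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : columnsByName.entrySet()) {
            // The existing object is stored as-is; nothing new is created.
            result.put(columnKey(db, table, e.getKey()), e.getValue());
        }
        return result;
    }

    /** Resolves source column keys to the already-known objects; unknown keys are skipped. */
    static List<Object> resolveSources(Map<String, Object> columnMap, List<String> sourceKeys) {
        List<Object> resolved = new ArrayList<>();
        for (String key : sourceKeys) {
            Object ref = columnMap.get(key);
            if (ref != null) {
                resolved.add(ref);
            }
        }
        return resolved;
    }
}
```

Because both the map keys and the lineage keys are built without the cluster 
name, the lookup succeeds and returns the same object instances that the 
table already refers to.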


- Vimal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149896
---


On Sept. 22, 2016, 11:53 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 22, 2016, 11:53 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-23 Thread Suma Shivaprasad

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review150215
---




addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java (line 101)


This should be replaced with HMSB.getTableQualifiedName.


- Suma Shivaprasad


On Sept. 22, 2016, 11:53 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 22, 2016, 11:53 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-23 Thread Suma Shivaprasad


> On Sept. 21, 2016, 8:58 p.m., Suma Shivaprasad wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java,
> >  line 97
> > 
> >
> > why is column qualifiedName different from the convention we are using 
> > for hive_column instances which are referred to from the table. Why is 
> > clustername removed?
> 
> Vimal Sharma wrote:
> Cluster information is not available in Lineage information provided by 
> Hive. Further, qualifiedName used in this patch is used only while setting 
> column lineage and is not used for communication with rest of Atlas codebase.

If we do not provide the same qualifiedName as in the current 
HMSB.getColumnQualifiedName(), it will result in another entity being 
created for the columns. Cluster information is available in 
HMSB.getClusterName().
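
A minimal sketch of this concern, assuming a store that identifies entities 
solely by their qualifiedName. EntityStoreSketch and the sample names 
(including the "db.table.column@cluster" shape) are hypothetical and are not 
Atlas APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical store that de-duplicates entities purely by qualified name. */
final class EntityStoreSketch {
    private final Map<String, Integer> idsByQualifiedName = new LinkedHashMap<>();

    /** Returns the id already registered for the name, or registers a new entity. */
    int createOrGet(String qualifiedName) {
        Integer id = idsByQualifiedName.get(qualifiedName);
        if (id == null) {
            id = idsByQualifiedName.size() + 1;
            idsByQualifiedName.put(qualifiedName, id);
        }
        return id;
    }

    /** Number of distinct entities registered so far. */
    int size() {
        return idsByQualifiedName.size();
    }
}
```

With such a store, registering "default.t1.id@primary" and then 
"default.t1.id" yields two entities for what is logically one column, which is 
the duplication being warned about.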


- Suma


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149896
---


On Sept. 22, 2016, 11:53 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 22, 2016, 11:53 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-22 Thread Vimal Sharma


> On Sept. 21, 2016, 8:58 p.m., Suma Shivaprasad wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java,
> >  line 97
> > 
> >
> > why is column qualifiedName different from the convention we are using 
> > for hive_column instances which are referred to from the table. Why is 
> > clustername removed?

Cluster information is not available in the lineage information provided by 
Hive. Further, the qualifiedName used in this patch is used only while setting 
column lineage and is not used for communication with the rest of the Atlas 
codebase.


- Vimal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149896
---


On Sept. 20, 2016, 9:07 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 20, 2016, 9:07 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-22 Thread Vimal Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/
---

(Updated Sept. 22, 2016, 11:53 a.m.)


Review request for atlas.


Changes
---

Fixed review comments from Shwetha and Suma. Added a try/catch so that Atlas 
doesn't fail to register the CTAS process when column lineage can't be set 
because lineage data from the Hive hook is absent.
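
The fallback can be sketched roughly as below. This is an assumption-laden 
illustration rather than the HiveHook code: CtasRegistrationSketch and 
buildEntities are hypothetical names, strings stand in for entities, and 
java.util.logging is used only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: strings stand in for the entities sent to Atlas. */
final class CtasRegistrationSketch {
    private static final Logger LOG =
            Logger.getLogger(CtasRegistrationSketch.class.getName());

    /**
     * Always includes the table-level CTAS process; column lineage entities
     * are added only if they can be computed.
     */
    static List<String> buildEntities(String ctasProcess,
                                      Supplier<List<String>> columnLineage) {
        List<String> entities = new ArrayList<>();
        entities.add(ctasProcess);
        try {
            entities.addAll(columnLineage.get());
        } catch (RuntimeException e) {
            // Warn instead of failing: the CTAS process is still registered.
            LOG.log(Level.WARNING,
                    "Column lineage unavailable; registering CTAS process without it", e);
        }
        return entities;
    }
}
```

If the supplier throws, the returned list still contains the CTAS process, 
so table-level lineage is registered even when column lineage is missing.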


Bugs: ATLAS-247
https://issues.apache.org/jira/browse/ATLAS-247


Repository: atlas


Description
---

After a CTAS (CREATE TABLE AS SELECT) query, the lineage relationship between 
the source columns and each destination column can be captured. This 
information can be used to create a column lineage process.


Diffs (updated)
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java a3464a0
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java 45f0bc9
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataTypes.java e094cb6
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java a5838b4

Diff: https://reviews.apache.org/r/52077/diff/


Testing
---


Thanks,

Vimal Sharma



Re: Review Request 52077: Column level lineage in Hive

2016-09-22 Thread Vimal Sharma


> On Sept. 21, 2016, 9:01 p.m., Suma Shivaprasad wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java, 
> > line 628
> > 
> >
> > shouldnt the lineage process refer to the same set of column instances 
> > which are already part of a table reference. Why are we recreating them?

Column Referenceable instances are not re-created. The column Referenceable 
instances that are part of the table reference are stored in the map 
columnQNameToRef, and this map is then used in the function 
createColumnLineageObjects to set the column lineage process.

The name createColumnLineageObjects is misleading, though. I have changed it to 
createColumnLineageProcessInstances.


- Vimal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149897
---


On Sept. 20, 2016, 9:07 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 20, 2016, 9:07 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-21 Thread Suma Shivaprasad

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149896
---




addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java (line 94)


Use a constant for "columns". Also, .getValuesMap can be removed; .get should 
work on Referenceable.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java (line 97)


Why is the column qualifiedName different from the convention we are using for 
hive_column instances which are referred to from the table? Why is the cluster 
name removed?


- Suma Shivaprasad


On Sept. 20, 2016, 9:07 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 20, 2016, 9:07 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-21 Thread Shwetha GS

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/#review149804
---




addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java (line 73)


Don't add toString() to arguments.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 634)


Why have you changed the entity update request to an entity create request? It 
should be an update.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 819)


Is e.key() the output column name? e.key doesn't contain the cluster name, so 
it might get de-duped across other lineage processes.

The qualified name for the lineage process should be
'qualified name of hive command process : column name'.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 1044)


Use the syntax LOG.debug("Column Lineage Map  - {}", 
this.lineageInfo.entrySet()); so that toString() is not evaluated if the debug 
log level is not enabled.
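
To make the point concrete, here is a self-contained sketch. TinyLogger is a 
hypothetical stand-in that mimics "{}" placeholder handling; it is not the 
real logging API used in HiveHook, and Counted exists only to observe how 
often toString() runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazyLogSketch {

    /** Minimal stand-in for a logger with "{}" placeholders (not a real logging API). */
    static final class TinyLogger {
        private final boolean debugEnabled;

        TinyLogger(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        void debug(String format, Object arg) {
            if (!debugEnabled) {
                return; // arg.toString() is never called on this path
            }
            System.out.println(format.replace("{}", String.valueOf(arg)));
        }
    }

    /** Counts toString() calls so the saving can be observed. */
    static final class Counted {
        final AtomicInteger calls = new AtomicInteger();

        @Override
        public String toString() {
            calls.incrementAndGet();
            return "lineage-map";
        }
    }
}
```

Passing the object itself defers formatting to the logger, whereas calling 
toString() at the call site or concatenating strings pays the cost even when 
debug logging is off.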



addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java (line 1121)


Rename to testColumnLevelLineage.



addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java (line 1143)


Also assert on the lineage API response on columns.


- Shwetha GS


On Sept. 20, 2016, 9:07 a.m., Vimal Sharma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52077/
> ---
> 
> (Updated Sept. 20, 2016, 9:07 a.m.)



Re: Review Request 52077: Column level lineage in Hive

2016-09-20 Thread Vimal Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/
---

(Updated Sept. 20, 2016, 9:07 a.m.)


Review request for atlas.


Changes
---

The newly added file was missing from the previous patch. Updating the patch.


Bugs: ATLAS-247
https://issues.apache.org/jira/browse/ATLAS-247


Repository: atlas


Description
---

After a CTAS (CREATE TABLE AS SELECT) query, the lineage relationship between 
the source columns and each destination column can be captured. This 
information can be used to create a column lineage process.


Diffs (updated)
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/ColumnLineageUtils.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java a3464a0
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java 45f0bc9
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataTypes.java e094cb6
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java a5838b4

Diff: https://reviews.apache.org/r/52077/diff/


Testing
---


Thanks,

Vimal Sharma



Review Request 52077: Column level lineage in Hive

2016-09-20 Thread Vimal Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52077/
---

Review request for atlas.


Repository: atlas


Description
---

After a CTAS (CREATE TABLE AS SELECT) query, the lineage relationship between 
the source columns and each destination column can be captured. This 
information can be used to create a column lineage process.


Diffs
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java a3464a0
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataModelGenerator.java 45f0bc9
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/model/HiveDataTypes.java e094cb6
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java a5838b4

Diff: https://reviews.apache.org/r/52077/diff/


Testing
---


Thanks,

Vimal Sharma