Sidharth Kumar Mishra created HIVE-22301:
--------------------------------------------
Summary: Hive lineage is not generated for insert overwrite
queries on partitioned tables
Key: HIVE-22301
URL: https://issues.apache.org/jira/browse/HIVE-22301
Project: Hive
Issue Type: Bug
Reporter: Sidharth Kumar Mishra
Attachments: ScreenShot HookContext.png, ScreenShot
RunPostExecHook.png, ScreenShot runBeforeExecution.png
Problem: When I run the queries below, the last query should produce Hive
lineage info (through HookContext) from table_b to table_t, but no lineage is
generated.
* Create table table_t (id int) partitioned by (dob date);
* Create table table_b (id int) partitioned by (dob date);
* from table_b a insert overwrite table table_t select a.id,a.dob;
Note: this issue is not seen for a CTAS query from a partitioned table. It is
seen only for insert queries such as insert into <table> select * from <table>
and for queries like the one above.
The issue is also seen in the latest HDP builds.
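A possible way to observe the same thing without Atlas or a debugger is sketched below. It assumes that the PreExecutePrinter and PostExecutePrinter hook classes shipped with Hive are on the classpath and that the deployment allows overriding hook properties at session level. Neither assumption was verified as part of this report. The two hooks print the entities they receive with PREHOOK: and POSTHOOK: prefixes, so a missing POSTHOOK: Output: line for table_t would be consistent with the empty outputs described here.

{code:sql}
-- Assumption: these printer hooks are available in the Hive build in use,
-- and session-level overrides of the hook properties are permitted.
set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.PreExecutePrinter;
set hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.PostExecutePrinter;

-- Same statement as in the steps above; compare the PREHOOK: Output and
-- POSTHOOK: Output lines printed for it.
from table_b a insert overwrite table table_t select a.id, a.dob;
{code}

Where the hook output ends up (client console or server log) depends on how the query is submitted.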
Technical Observations:
The HookContext (passed from hive.ql.Driver to the Atlas Hive hook through the
hookRunner.runPostExecHooks call) contains no outputs. See the screenshot below
from IntelliJ.
!ScreenShot RunPostExecHook.png!
I found that the PrivateHookContext is initially created with the proper
outputs value, as shown below:
!ScreenShot HookContext.png!
The same is passed properly to runBeforeExecutionHook as shown below:
!ScreenShot runBeforeExecution.png!
Later, when the HookContext is passed to runPostExecHooks, its outputs are
empty. Please check the reason and let me know if you need any further
information from me.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)