[ https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851718#action_12851718 ]
Zheng Shao commented on HIVE-1131:
----------------------------------
> Look at the DataContainer class. That has a partition in it. And the
> Dependency has a mapping from Partition to the dependencies. Can you explain
> more your concerns on inefficiency?
I see. So the DataContainer captures the output partition information, but we
don't have input partition information (BaseColumnInfo/TableAliasInfo). This is
reasonable, since the input can span a large number of partitions.
> For S6 actually the queryplan is the wrong place to store the lineageinfo.
> Because of the dynamic partitioning work that Ning is doing, I have to
> generate the partition to dependency mapping at run time. So I would rather
> store it in a run time structure as opposed to a compile time structure.
> SessionState fits that bill, though I think we should have another structure
> called ExecutionCtx for this. But otherwise I think we want to store this in
> a runtime structure.
+1 on the ExecutionCtx idea. SessionState is scoped to the session, whereas
LineageInfo is scoped to a single query. It would be great to put LineageInfo
into ExecutionCtx.
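
To make the scoping argument concrete, here is a rough sketch of the idea, not
the patch itself. All class and method names below (SessionSketch,
ExecutionCtxSketch, addDependency, and so on) are hypothetical stand-ins and do
not reflect Hive's actual SessionState, LineageInfo, or DataContainer APIs. The
partition and column names are made up as well. The sketch assumes lineage is
keyed by output partition and filled in at run time, so that partitions
discovered by dynamic partitioning can be recorded as they appear.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT Hive's real classes.

/** Session-scoped state: it outlives any single query. */
final class SessionSketch {
    private int queriesRun = 0;

    /** Each query gets a fresh context, so lineage cannot leak across queries. */
    ExecutionCtxSketch newQuery() {
        queriesRun++;
        return new ExecutionCtxSketch();
    }

    int queriesRun() {
        return queriesRun;
    }
}

/** Query-scoped runtime state: lineage keyed by output partition. */
final class ExecutionCtxSketch {
    // Output partition name -> input columns that partition depends on.
    private final Map<String, List<String>> lineage = new LinkedHashMap<>();

    /** Called at run time, e.g. once a dynamically created partition is known. */
    void addDependency(String outputPartition, String inputColumn) {
        lineage.computeIfAbsent(outputPartition, k -> new ArrayList<>())
               .add(inputColumn);
    }

    /** Read-only view of the dependencies recorded for one output partition. */
    List<String> dependenciesOf(String outputPartition) {
        return Collections.unmodifiableList(
            lineage.getOrDefault(outputPartition, Collections.emptyList()));
    }

    int partitionCount() {
        return lineage.size();
    }
}

final class LineageSketch {
    public static void main(String[] args) {
        SessionSketch session = new SessionSketch();

        // First query writes one (hypothetical) partition fed by two columns.
        ExecutionCtxSketch q1 = session.newQuery();
        q1.addDependency("ds=2010-03-30", "src.key");
        q1.addDependency("ds=2010-03-30", "src.value");

        // Second query in the same session starts with empty lineage.
        ExecutionCtxSketch q2 = session.newQuery();

        System.out.println(q1.dependenciesOf("ds=2010-03-30"));
        System.out.println(q2.partitionCount());
    }
}
```

Under these assumptions, a hook that runs for the second query would see only
that query's lineage, which is the property that storing LineageInfo directly
on a session-level object would not give for free.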
> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>
> Key: HIVE-1131
> URL: https://issues.apache.org/jira/browse/HIVE-1131
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Ashish Thusoo
> Assignee: Ashish Thusoo
> Attachments: HIVE-1131.patch, HIVE-1131_2.patch, HIVE-1131_3.patch,
> HIVE-1131_4.patch
>
>
> We need a mechanism to pass the lineage information of the various columns of
> a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this through a bunch of classes to the pre
> execution hook interface to the clients and put in the necessary
> transformation logic in the optimizer to generate this information.