[jira] Commented: (HIVE-1131) Add column lineage information to the pre execution hooks

Zheng Shao (JIRA) Fri, 19 Feb 2010 18:15:50 -0800

    [ https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836103#action_12836103 ]


Zheng Shao commented on HIVE-1131:
----------------------------------

S1. Can we make lineage partition-level instead of table-level?
S2. We might want to formally define these dependency levels, especially 
how they compose: what is the type of a UDF applied to a UDAF, or of a UDAF 
applied to a UDF, as in round(sum(col)) or sum(round(col))?
{code}
+  /**
+   * Enum to track dependency. This enum has two values:
+   * 1. SCALAR - Indicates that the column is derived from a scalar expression.
+   * 2. AGGREGATION - Indicates that the column is derived from an aggregation.
+   */
+  public static enum DependencyType {
+    SIMPLE, UDF, UDAF, UDTF, SCRIPT, SET
+  }
+  
{code}
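
To make the S2 question concrete, here is a minimal sketch of one possible 
composition rule. It is not taken from the patch: the class name, the 
compose() helper, and the rule itself (the composite takes whichever type is 
declared later in the enum) are assumptions for illustration only.

```java
final class LineageCompositionSketch {
  // Same values, in the same order, as the enum in the patch excerpt.
  enum DependencyType { SIMPLE, UDF, UDAF, UDTF, SCRIPT, SET }

  // Assumed rule (hypothetical, not from the patch): the composite takes
  // whichever of the two types is declared later in the enum.
  static DependencyType compose(DependencyType inner, DependencyType outer) {
    if (inner.compareTo(outer) >= 0) {
      return inner;
    }
    return outer;
  }

  public static void main(String[] args) {
    // round(sum(col)): a UDF applied to the result of a UDAF.
    System.out.println(compose(DependencyType.UDAF, DependencyType.UDF));
    // sum(round(col)): a UDAF applied to the result of a UDF.
    System.out.println(compose(DependencyType.UDF, DependencyType.UDAF));
  }
}
```

Under this assumed rule both expressions come out as UDAF. Whether the two 
should really be treated the same is exactly what a formal definition would 
need to settle.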

S3. Use braces "{}" even for single-statement bodies of "if", "for", etc.
S4. Use "ArrayList" instead of "Vector" when the list is accessed by a 
single thread.
S5. Remove "private HashMap<FileSinkOperator, Table> fopToTable;" since it 
is not used.
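
A small sketch of what S3 and S4 mean in practice. The class and method 
names are hypothetical and do not come from the patch.

```java
import java.util.ArrayList;
import java.util.List;

final class ListChoiceSketch {
  // The list is only touched by the calling thread, so ArrayList is enough.
  // Vector would synchronize on every call without any benefit here.
  static List<String> collectColumns(String[] names) {
    List<String> cols = new ArrayList<String>(names.length);
    for (String name : names) {
      // Braces are kept even though the body is a single statement (S3).
      cols.add(name);
    }
    return cols;
  }
}
```

If such a list ever does need to be shared across threads, wrapping it with 
Collections.synchronizedList is one standard alternative to Vector.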


> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>
>                 Key: HIVE-1131
>                 URL: https://issues.apache.org/jira/browse/HIVE-1131
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-1131.patch
>
>
> We need a mechanism to pass the lineage information of the various columns of 
> a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this to clients through a set of classes in the 
> pre execution hook interface, and to put the necessary transformation 
> logic in the optimizer to generate this information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
