[
https://issues.apache.org/jira/browse/HIVE-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740329#action_12740329
]
Adam Kramer commented on HIVE-431:
----------------------------------
Another note: Standard SQL has no equivalent of this because it allows
updates, so the SELECT query that generated a table's data could quickly
become obsolete. Not so with Hive!
This would also be very useful metadata to search, as it lets us cross-index
tables to know which tables feed which other tables. That will help us detect
aggregate tables (to avoid re-aggregation) and identify dependencies among
tables (so if we change a given table's contents, we will have a good guess at
who will be affected).
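
As a rough illustration of the cross-indexing idea, here is a minimal Python
sketch. It assumes the stored SELECT statements have already been read out of
the table properties into a plain dict; the table names, the queries, and the
helper functions are all hypothetical, and the FROM/JOIN regex is a naive
stand-in for a real SQL parser.

```python
import re
from collections import defaultdict

# Hypothetical metadata: table name -> SELECT statement stored as a table property.
stored_selects = {
    "daily_clicks": "SELECT user_id, COUNT(1) FROM raw_clicks GROUP BY user_id",
    "weekly_clicks": "SELECT user_id, SUM(n) FROM daily_clicks GROUP BY user_id",
    "click_report": (
        "SELECT w.user_id, u.name FROM weekly_clicks w "
        "JOIN users u ON (w.user_id = u.id)"
    ),
}

# Naive pattern: the identifier that follows FROM or JOIN. Subqueries,
# comments, and quoted names would need an actual SQL parser.
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def source_tables(select_sql):
    """Return the set of table names a SELECT statement appears to read from."""
    return set(SOURCE_RE.findall(select_sql))


def build_feeds(selects):
    """Map each source table to the set of tables that are filled from it."""
    feeds = defaultdict(set)
    for target, sql in selects.items():
        for source in source_tables(sql):
            feeds[source].add(target)
    return feeds


def affected_by(table, feeds):
    """Return every table that depends, directly or indirectly, on `table`."""
    seen = set()
    stack = [table]
    while stack:
        current = stack.pop()
        for downstream in feeds.get(current, ()):
            if downstream not in seen:
                seen.add(downstream)
                stack.append(downstream)
    return seen


feeds = build_feeds(stored_selects)
print(sorted(affected_by("raw_clicks", feeds)))
```

With this sample data, changing raw_clicks would flag daily_clicks,
weekly_clicks, and click_report as possibly affected, while changing users
would flag only click_report.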
> Auto-add table property "select" to be the select statement that created the
> table
> ----------------------------------------------------------------------------------
>
> Key: HIVE-431
> URL: https://issues.apache.org/jira/browse/HIVE-431
> Project: Hadoop Hive
> Issue Type: Wish
> Reporter: Adam Kramer
>
> A syntactic copy of the query that was used to fill a table would often be
> AMAZINGLY useful for figuring out where the data in the table came from.
> I think the best way to implement this would be to automatically add a table
> property which includes the SELECT statement. For partitioned tables, this
> would need to exist for each partition...or perhaps use some canonical name
> like selectquery for unpartitioned tables, plus selectquery_ds=<DATEID> for
> partitioned tables.
> This problem is growing as more and more tables in our database are generated
> by either "root" or by people who are no longer easy to contact.
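
The naming scheme proposed in the description could be sketched roughly as
follows. This is only an illustration in Python, not Hive code: the function
names and the dict standing in for a table's properties are hypothetical, and
joining several partition columns with "/" is an assumption the description
does not address.

```python
def select_property_key(partition_spec=None):
    """Build the property name under which a SELECT statement is stored.

    Unpartitioned tables use the canonical name "selectquery"; a partition
    such as ds=<DATEID> gets "selectquery_ds=<DATEID>".
    """
    if not partition_spec:
        return "selectquery"
    # Assumption: multiple partition columns are joined with "/".
    parts = ["{}={}".format(col, val) for col, val in partition_spec.items()]
    return "selectquery_" + "/".join(parts)


def record_select(table_properties, select_sql, partition_spec=None):
    """Store the generating SELECT in a (hypothetical) table-properties dict."""
    table_properties[select_property_key(partition_spec)] = select_sql
    return table_properties


props = {}
record_select(props, "SELECT * FROM src WHERE ds = '2009-08-06'", {"ds": "2009-08-06"})
print(props)
```

Keeping one key per partition means each partition's data can be traced back
to the exact statement that filled it, even if the query changes over time.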