[
https://issues.apache.org/jira/browse/HIVE-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777702#action_12777702
]
Paul Yang commented on HIVE-655:
--------------------------------
I had a discussion with Ning and Namit this morning, and a slightly different
syntax for UDTFs was proposed. Something like:
{code}
SELECT pageid, adid FROM myTable LATERAL VIEW explode(adid_list) AS adid ;
{code}
where the LATERAL VIEW keyword associates the given UDTF with the table in the
FROM clause. As Ning pointed out, one of the issues with having the UDTF in the
SELECT is that queries like the following
{code}
SELECT pageid, explode(adid_list), count(1) FROM myTable GROUP BY pageid;
{code}
are a bit confusing, as it's not clear what they are supposed to do. We could
disallow these sorts of operations, but that makes things more complicated for
the user. Using LATERAL VIEW also handles Raghotham's concern about having to
specify the input for the UDTF. The UDTF still returns one column, though
multiple values can be returned via an array or a struct.
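To make the intended semantics a bit more concrete, here is a rough Python
sketch (not Hive code) of how I read the proposal: each row of the table in the
FROM clause is joined with every row the UDTF generates from it. The helper
lateral_view, the toy explode, and the sample rows in my_table are all
hypothetical and only for illustration. Dropping input rows whose list is
empty is an assumption of this sketch, not something that has been decided.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


def explode(values: Iterable[Any]) -> Iterator[Any]:
    """Toy UDTF: emit one output value per element of the input list."""
    for value in values:
        yield value


def lateral_view(
    rows: Iterable[Row],
    udtf: Callable[[Any], Iterable[Any]],
    input_col: str,
    alias: str,
) -> Iterator[Row]:
    """Join each input row with every value the UDTF generates from it.

    The generated value is exposed as a single new column named `alias`.
    An input row whose UDTF output is empty yields no rows (an assumption
    of this sketch).
    """
    for row in rows:
        for generated in udtf(row[input_col]):
            joined = dict(row)  # copy so the input row is left untouched
            joined[alias] = generated
            yield joined


# Hypothetical sample data standing in for myTable.
my_table: List[Row] = [
    {"pageid": "front_page", "adid_list": [1, 2, 3]},
    {"pageid": "contact_page", "adid_list": [3, 4]},
]

# Rough equivalent of:
#   SELECT pageid, adid FROM myTable LATERAL VIEW explode(adid_list) AS adid;
result: List[Tuple[str, int]] = [
    (r["pageid"], r["adid"])
    for r in lateral_view(my_table, explode, "adid_list", "adid")
]
```

Since the view is built in the FROM clause, pageid and adid are plain columns
by the time the SELECT list sees them, so something like a GROUP BY would
operate on the joined rows.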
Zheng, do you have any thoughts about the proposed syntax? I know from early on
UDTFs were planned to be in the SELECT clause, and I'm wondering if there were
other reasons why UDTFs should be there. With SELECT, it seemed more
straightforward implementation-wise. Also, going back to TRANSFORM, it does
seem like it can fit in FROM too. What was the rationale for having it in the
SELECT?
> Add support for user defined table generating functions
> -------------------------------------------------------
>
> Key: HIVE-655
> URL: https://issues.apache.org/jira/browse/HIVE-655
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Raghotham Murthy
> Assignee: Paul Yang
> Attachments: HIVE-655.1.patch, HIVE-655.2.patch
>
>
> Provide a way for users to add a table generating function, i.e., functions
> that generate multiple rows from a single input row. Currently, the only way
> to do it is via the TRANSFORM clause which requires streaming the data.