[
https://issues.apache.org/jira/browse/HIVE-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743934#action_12743934
]
Zheng Shao commented on HIVE-655:
---------------------------------
Another way is like this:
1. Simplest example: just use the same syntax as a UDF.
{code}
SELECT pageid, EXPLODE(adid_list) AS adid
FROM mytable;
{code}
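To make the intended row semantics concrete, here is a minimal Python sketch of what the query above would return. The helper name explode_column and the sample data are hypothetical and only illustrate the idea; this is not Hive code. The sketch also assumes that a row with an empty list produces no output rows.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]

def explode_column(rows: List[Row], list_col: str, out_col: str) -> Iterator[Row]:
    """Yield one output row per element of row[list_col].

    All other columns are copied unchanged. A row whose list is empty
    yields nothing (an assumption of this sketch).
    """
    for row in rows:
        base = {k: v for k, v in row.items() if k != list_col}
        for item in row[list_col]:
            yield {**base, out_col: item}

# Hypothetical contents of mytable.
mytable = [
    {"pageid": "front_page", "adid_list": [1, 2, 3]},
    {"pageid": "contact_page", "adid_list": [3, 4]},
]

# Rough equivalent of: SELECT pageid, EXPLODE(adid_list) AS adid FROM mytable;
result = list(explode_column(mytable, "adid_list", "adid"))
```

Each input row fans out into as many output rows as its list has elements, so the two sample rows become five.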
2. If the UDTF produces more than one column, we have three options:
{code}
-- A. Simplest way: needs common subexpression elimination to achieve good performance
SELECT pageid, EXPLODE(ad_list).adid AS adid, EXPLODE(ad_list).adtext AS adtext
FROM mytable;

-- B. Simplify the query using a subquery:
SELECT pageid, ad.adid AS adid, ad.adtext AS adtext
FROM (SELECT pageid, EXPLODE(ad_list) AS ad
      FROM mytable) a;

-- C. Expand the structure inline:
SELECT pageid, EXPLODE(ad_list) AS (adid, adtext)
FROM mytable;
{code}
Hive already supports B. For A, we need common subexpression elimination, but
I guess we want to do that anyway.
C seems a nice extension, but it is not limited to UDTFs: UDFs and UDAFs should
support the same thing, if we want to support this.
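As a rough illustration of the result that options B and C are meant to produce, the Python sketch below explodes the list of structs once and then projects the fields from each element. The function name explode_structs and the sample data are hypothetical, and structs are modeled as plain dicts; this is a sketch of the semantics, not of how Hive would evaluate the query.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]

def explode_structs(rows: List[Row], list_col: str, fields: List[str]) -> Iterator[Row]:
    """Explode row[list_col] once, then project the named fields of each struct."""
    for row in rows:
        base = {k: v for k, v in row.items() if k != list_col}
        for struct in row[list_col]:  # the single EXPLODE call
            projected = {name: struct[name] for name in fields}
            yield {**base, **projected}

# Hypothetical contents of mytable; each ad is a struct with adid and adtext.
mytable = [
    {"pageid": "front_page",
     "ad_list": [{"adid": 1, "adtext": "shoes"}, {"adid": 2, "adtext": "hats"}]},
    {"pageid": "contact_page",
     "ad_list": [{"adid": 3, "adtext": "bags"}]},
]

# Rough equivalent of: SELECT pageid, EXPLODE(ad_list) AS (adid, adtext) FROM mytable;
result = list(explode_structs(mytable, "ad_list", ["adid", "adtext"]))
```

Option A asks for the same output, but it writes the EXPLODE call twice, which is why it relies on the two calls being recognized as one.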
3. Parallel UDTF calls mean a cross product:
{code}
SELECT pageid, EXPLODE(adid_list) AS adid, EXPLODE(link_list) AS link
FROM mytable;
{code}
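The cross-product behavior can be sketched in Python as follows. The function name explode_parallel and the sample data are hypothetical; the sketch assumes that every element of the first list is paired with every element of the second list within the same input row.

```python
from itertools import product
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]

def explode_parallel(rows: List[Row], col_a: str, out_a: str,
                     col_b: str, out_b: str) -> Iterator[Row]:
    """Yield one row per (a, b) pair drawn from the two lists of each input row."""
    for row in rows:
        base = {k: v for k, v in row.items() if k not in (col_a, col_b)}
        for a, b in product(row[col_a], row[col_b]):
            yield {**base, out_a: a, out_b: b}

# Hypothetical contents of mytable.
mytable = [
    {"pageid": "front_page", "adid_list": [1, 2], "link_list": ["a", "b", "c"]},
]

# Rough equivalent of:
#   SELECT pageid, EXPLODE(adid_list) AS adid, EXPLODE(link_list) AS link FROM mytable;
result = list(explode_parallel(mytable, "adid_list", "adid", "link_list", "link"))
```

With two ad ids and three links, the single input row becomes 2 x 3 = 6 output rows, and if either list is empty the row produces no output at all.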
> Add support for user defined table generating functions
> -------------------------------------------------------
>
> Key: HIVE-655
> URL: https://issues.apache.org/jira/browse/HIVE-655
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Raghotham Murthy
> Assignee: Raghotham Murthy
>
> Provide a way for users to add a table generating function, i.e., functions
> that generate multiple rows from a single input row. Currently, the only way
> to do it is via the TRANSFORM clause which requires streaming the data.