[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
George Pachitariu updated HIVE-20262:
-------------------------------------
Attachment: HIVE-20262.2.patch
Status: Patch Available (was: Open)
> Implement stats annotation rule for the UDTFOperator
> ----------------------------------------------------
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Reporter: George Pachitariu
> Assignee: George Pachitariu
> Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows in the output.
> A common UDTF is explode(), which creates one row for each element of each
> array in the input column.
>
> Right now, the estimated number of output rows is equal to the number of
> input rows. But if a UDTF emits more than one row per input row on average,
> the execution plan underestimates the resulting number of rows.
>
> Implement a rule that takes a factor X as a parameter and, for each UDTF,
> predicts that:
>
> {code:java}
> number of output rows = X * number of input rows{code}
>
>
>
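The estimate proposed in the description can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not the code in the attached patches: the class name UdtfRowEstimator, the method estimateOutputRows, and the choice to saturate at Long.MAX_VALUE instead of overflowing are all assumptions made for the example, and no Hive API is used.

```java
// Hypothetical helper, not part of Hive: it only illustrates the proposed estimate.
final class UdtfRowEstimator {

    private UdtfRowEstimator() {}

    /**
     * Estimates the number of rows a UDTF emits as factor * inputRows.
     * The parameter "factor" plays the role of X in the issue description.
     */
    static long estimateOutputRows(long inputRows, double factor) {
        if (inputRows < 0 || factor < 0 || Double.isNaN(factor)) {
            throw new IllegalArgumentException(
                "inputRows and factor must be non-negative");
        }
        double estimate = factor * (double) inputRows;
        // Assumption: saturate rather than overflow for very large products.
        if (estimate >= (double) Long.MAX_VALUE) {
            return Long.MAX_VALUE;
        }
        return Math.round(estimate);
    }
}
```

With a factor of 2.5, an input of 1000 rows yields an estimate of 2500 rows. A factor of 1.0 reproduces the current behaviour, where the output estimate equals the input row count.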