[
https://issues.apache.org/jira/browse/SPARK-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253579#comment-14253579
]
William Benton commented on SPARK-4867:
---------------------------------------
[~marmbrus] I actually think exposing an interface that looks something like
overloading might be the right approach. (To be clear, I think polymorphism
poses a far greater difficulty with implicit coercion than without it, but it
might be possible to solve the ambiguity there by letting users register
functions in a priority order.)
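
A minimal sketch of the priority-order idea, in Python to keep it short. Nothing here is Spark's API: FunctionRegistry, register, resolve, and the WIDENS_TO table are all hypothetical names, and the widening rules are an assumption chosen only for illustration. Overloads are tried in registration order, and the first one whose declared input types accept the arguments after implicit widening wins.

```python
# Hypothetical sketch, not Spark's API: every name below is invented for illustration.

# Assumed implicit-widening table: each type maps to the types it may be coerced to.
WIDENS_TO = {
    "int": ["int", "long", "double"],
    "long": ["long", "double"],
    "double": ["double"],
    "string": ["string"],
}


class FunctionRegistry:
    def __init__(self):
        # name -> list of (input_types, fn), kept in registration (priority) order
        self._overloads = {}

    def register(self, name, input_types, fn):
        """Append an overload; earlier registrations take priority."""
        self._overloads.setdefault(name, []).append((tuple(input_types), fn))

    def resolve(self, name, arg_types):
        """Return (input_types, fn) for the first overload that accepts
        arg_types after implicit coercion."""
        for input_types, fn in self._overloads.get(name, []):
            if len(input_types) == len(arg_types) and all(
                want in WIDENS_TO.get(have, [])
                for have, want in zip(arg_types, input_types)
            ):
                return input_types, fn
        raise LookupError(f"no overload of {name} accepts {arg_types}")


registry = FunctionRegistry()
registry.register("plus", ["long", "long"], lambda a, b: a + b)
registry.register("plus", ["double", "double"], lambda a, b: float(a) + float(b))
```

With these two registrations, a call with (int, int) matches both overloads after widening, and the (long, long) one wins because it was registered first; a call with (int, double) can only match (double, double). Without the ordering, the first call would be ambiguous.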
> UDF clean up
> ------------
>
> Key: SPARK-4867
> URL: https://issues.apache.org/jira/browse/SPARK-4867
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Michael Armbrust
> Priority: Blocker
>
> Right now our support for and internal implementation of many functions have a
> few issues. Specifically:
> - UDFs don't know their input types and thus don't do type coercion.
> - We hard-code a bunch of built-in functions into the parser. This is bad
> because in SQL it creates new reserved words for things that aren't actually
> keywords. It also means that for each function we need to add support to
> both SQLContext and HiveContext separately.
> For this JIRA I propose we do the following:
> - Change the interfaces for registerFunction and ScalaUdf to include types
> for the input arguments as well as the output type.
> - Add a rule to analysis that does type coercion for UDFs.
> - Add a parse rule for functions to SQLParser.
> - Rewrite all the UDFs that are currently hacked into the various parsers
> using this new functionality.
> Depending on how big this refactoring becomes we could split parts 1&2 from
> part 3 above.
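
To make the coercion step in the description concrete, here is a rough sketch of what an analysis rule for UDF inputs could look like once a UDF carries its declared input types. It is written in Python for brevity and is not Catalyst code: Literal, Cast, UdfCall, and coerce_udf_inputs are hypothetical stand-ins, and the sketch assumes every requested cast is legal rather than checking it.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical expression nodes; they only mimic the shape of an expression tree.


@dataclass(frozen=True)
class Literal:
    value: object
    data_type: str


@dataclass(frozen=True)
class Cast:
    child: object
    data_type: str


@dataclass(frozen=True)
class UdfCall:
    name: str
    input_types: Tuple[str, ...]  # declared when the function is registered
    return_type: str
    args: Tuple[object, ...]

    @property
    def data_type(self):
        return self.return_type


def coerce_udf_inputs(expr):
    """Wrap each UDF argument whose type differs from the declared input
    type in a Cast. Assumes the cast is legal; a real rule would check."""
    if isinstance(expr, Cast):
        return Cast(coerce_udf_inputs(expr.child), expr.data_type)
    if not isinstance(expr, UdfCall):
        return expr
    coerced_children = [coerce_udf_inputs(a) for a in expr.args]
    new_args = tuple(
        arg if arg.data_type == want else Cast(arg, want)
        for arg, want in zip(coerced_children, expr.input_types)
    )
    return UdfCall(expr.name, expr.input_types, expr.return_type, new_args)


call = UdfCall(
    name="pad_to",  # hypothetical UDF declared as (string, long) -> string
    input_types=("string", "long"),
    return_type="string",
    args=(Literal("abc", "string"), Literal(8, "int")),
)
analyzed = coerce_udf_inputs(call)
```

After the rule runs, the string argument is left alone and the int literal is wrapped in a Cast to long, so the function body only ever sees the types it declared.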