Michael Chirico created SPARK-26331:
---------------------------------------

             Summary: Allow SQL UDF registration to recognize default function values from Scala
                 Key: SPARK-26331
                 URL: https://issues.apache.org/jira/browse/SPARK-26331
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 2.4.0
            Reporter: Michael Chirico


As described here:

[https://stackoverflow.com/q/53702727/3576984]

I have a UDF that should be flexible enough to accept 3 arguments (or, in general, n+k), even though in most calls only 2 (in general, n) are supplied. The natural approach is to implement the UDF with 3 arguments, one of which has a standard default value.

Copying a toy example from SO:

{code:scala}
package myUDFs

import org.apache.spark.sql.api.java.UDF3

class my_udf extends UDF3[Int, Int, Int, Int] {
  override def call(a: Int, b: Int, c: Int = 6): Int = c * (a + b)
}
{code}

I would prefer the following to give the expected output of 18:

{code:python}
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark_conf = SparkConf().setAll([
    ('spark.jars', 'myUDFs-assembly-0.1.1.jar')
])
spark = (SparkSession.builder
         .appName('my_app')
         .config(conf=spark_conf)
         .enableHiveSupport()
         .getOrCreate())

spark.udf.registerJavaFunction("my_udf", "myUDFs.my_udf", IntegerType())
{code}

{{spark.sql('select my_udf(1, 2)').collect()}}

But it seems this is currently impossible. Presumably this is because Scala compiles default argument values into synthetic helper methods on the class, while {{registerJavaFunction}} dispatches purely through the Java {{UDF3}} interface, which always expects all three arguments; a two-argument call therefore has no matching {{call}} method to bind to.
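
A possible interim workaround (a sketch only; the {{my_udf_2}} class name and the repeated definition of {{my_udf}} are illustrative, not part of the original report) is to expose a separate two-argument entry point on the JVM side that fills in the default itself, and register it under its own SQL name:

{code:scala}
package myUDFs

import org.apache.spark.sql.api.java.{UDF2, UDF3}

// Three-argument form, as defined above.
class my_udf extends UDF3[Int, Int, Int, Int] {
  override def call(a: Int, b: Int, c: Int = 6): Int = c * (a + b)
}

// Hypothetical two-argument wrapper that bakes in the default c = 6.
// Would be registered under a second name from PySpark, e.g.:
//   spark.udf.registerJavaFunction("my_udf2", "myUDFs.my_udf_2", IntegerType())
class my_udf_2 extends UDF2[Int, Int, Int] {
  override def call(a: Int, b: Int): Int = new my_udf().call(a, b, 6)
}
{code}

This keeps the default value in one place in the Scala code, but costs one extra class and one extra {{registerJavaFunction}} call per defaulted argument, which is exactly the boilerplate this issue asks to eliminate.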


