[ 
https://issues.apache.org/jira/browse/SPARK-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134322#comment-15134322
 ] 

Cristian Opris commented on SPARK-11725:
----------------------------------------

I would argue this is a rather inappropriate solution. Scala does not normally 
distinguish between primitive and boxed types, with Long for example 
representing both. 

So having Long args in a function and then testing for null is a valid thing to 
do in Scala.

This is made worse by the fact that the behaviour is not documented anywhere, 
so it produces very strange and unexpected results.

By the principle of least surprise, I would suggest either (or both) of:

- Throwing an error when a null value is passed to a UDF that has been 
compiled to only accept primitive (non-null) values.
- Using Option[T] as a UDF arg to signal that the function accepts nulls.
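
To make the two suggestions concrete, here is a minimal sketch in plain Scala. 
It does not use any Spark API; the object and method names 
(NullableArgSketch, strict, callWithNullable and so on) are hypothetical and 
only illustrate the difference between a primitive argument, a boxed argument 
and an Option argument.

```scala
// Hypothetical names throughout; plain Scala only, no Spark dependency.
object NullableArgSketch {
  // Primitive-style signature: takes a JVM long, so a null can never reach the body.
  val plusOnePrimitive: Long => Long = x => x + 1

  // Boxed signature: the caller may pass null and the function can test for it.
  val plusOneBoxed: java.lang.Long => java.lang.Long =
    x => if (x == null) null else java.lang.Long.valueOf(x.longValue + 1)

  // Option-based signature: the type itself says that a missing value is accepted.
  val plusOneOption: Option[Long] => Option[Long] = _.map(_ + 1)

  // First suggestion: fail loudly instead of substituting a default value
  // when a null is about to be handed to a primitive-argument function.
  def strict(f: Long => Long)(raw: java.lang.Long): Long = {
    if (raw == null)
      throw new IllegalArgumentException("null passed to a function that takes a primitive Long")
    f(raw.longValue)
  }

  // Second suggestion: wrap the possibly-null value in an Option before the call.
  def callWithNullable(f: Option[Long] => Option[Long], raw: java.lang.Long): Option[Long] =
    f(Option(raw).map(_.longValue))
}
```

With this sketch, a null input either raises an error (strict) or shows up as 
None (callWithNullable), rather than being silently replaced by a special 
value.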






> Let UDF to handle null value
> ----------------------------
>
>                 Key: SPARK-11725
>                 URL: https://issues.apache.org/jira/browse/SPARK-11725
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Jeff Zhang
>            Assignee: Wenchen Fan
>            Priority: Blocker
>              Labels: releasenotes
>             Fix For: 1.6.0
>
>
> I notice that currently spark will take the long field as -1 if it is null.
> Here's the sample code.
> {code}
> sqlContext.udf.register("f", (x:Int)=>x+1)
> df.withColumn("age2", expr("f(age)")).show()
> //////////////// Output ///////////////////////
> +----+-------+----+
> | age|   name|age2|
> +----+-------+----+
> |null|Michael|   0|
> |  30|   Andy|  31|
> |  19| Justin|  20|
> +----+-------+----+
> {code}
> I think for the null value we have 3 options
> * Use a special value to represent it (what spark does now)
> * Always return null if the udf input has null value argument 
> * Let udf itself to handle null
> I would prefer the third option 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
