[ 
https://issues.apache.org/jira/browse/SPARK-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680938#comment-16680938
 ] 

Yuval Yaari commented on SPARK-25976:
-------------------------------------

Correct; however, in Scala there is not much of a performance penalty for 
calling isEmpty.

I suggest:
{code:java}
> sc.emptyRDD[Double].reduce(_ + _)
java.lang.UnsupportedOperationException: empty collection
> sc.emptyRDD[Double].reduce(_ + _, () => Double.NaN)  // proposed overload
Double.NaN
{code}
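
Until an overload like this exists, similar behaviour can be approximated in user code. The following is only a minimal sketch, assuming that an isEmpty check followed by reduce is acceptable (the check runs as a separate, small job). The names ReduceOrElseSketch and reduceOrElse are hypothetical and are not part of Spark's API:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ReduceOrElseSketch {
  // Hypothetical extension method; Spark itself does not provide reduceOrElse.
  implicit class ReduceOrElseOps[T](private val rdd: RDD[T]) extends AnyVal {
    // Returns defaultIfEmpty() when the RDD has no elements, otherwise reduce(f).
    // isEmpty() is evaluated first, so reduce never sees an empty RDD.
    def reduceOrElse(f: (T, T) => T, defaultIfEmpty: () => T): T =
      if (rdd.isEmpty()) defaultIfEmpty() else rdd.reduce(f)
  }

  def main(args: Array[String]): Unit = {
    // Local-mode context, for illustration only.
    val conf = new SparkConf().setMaster("local[1]").setAppName("reduce-or-else")
    val sc = new SparkContext(conf)
    try {
      val empty = sc.emptyRDD[Double].reduceOrElse(_ + _, () => Double.NaN)
      val total = sc.parallelize(Seq(1.0, 2.0, 3.0)).reduceOrElse(_ + _, () => Double.NaN)
      println(s"empty=$empty total=$total")
    } finally {
      sc.stop()
    }
  }
}
```

Passing the default as a function keeps it lazy, so a default that throws is only triggered when the RDD really is empty.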
 

> Allow rdd.reduce on empty rdd by returning an Option[T]
> -------------------------------------------------------
>
>                 Key: SPARK-25976
>                 URL: https://issues.apache.org/jira/browse/SPARK-25976
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.2
>            Reporter: Yuval Yaari
>            Priority: Minor
>
> It is sometimes useful to let the user decide what value to return when 
> reducing an empty RDD.
> Currently, if there is no data to reduce, an UnsupportedOperationException is 
> thrown. 
> Although the user can catch that exception, this seems like a "shaky" solution, 
> as an UnsupportedOperationException might be thrown from a different location.
> Instead, we can overload the reduce method by adding a new method:
> reduce(f: (T, T) => T, defaultIfEmpty: () => T): T
> The existing reduce API will not be affected, as it will simply call the second 
> reduce method with a default that throws an UnsupportedOperationException.
>  


