[ https://issues.apache.org/jira/browse/SPARK-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680870#comment-16680870 ]

Yuval Yaari commented on SPARK-25976:
-------------------------------------

Hi. Thanks for your answer. 

Fold with an initial value is not the same as giving reduce a default value 
when the RDD is empty. When the RDD is not empty, the initial value affects 
the fold operation, which is sometimes not desired.

For example, when the end user can choose to aggregate by max, min, sum, or 
avg, the initial value has to be different for each one, which makes the code 
cumbersome. There are other use cases where, if the RDD is empty, the user 
would want some empty (no data) value. That cannot be expressed with fold, as 
the sketch below illustrates.
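A minimal local sketch of that point (made-up data and names, not taken from the issue): fold's zero value takes part in the computation, so it has to be chosen per operation, and an empty RDD can only ever return that zero value rather than signalling "no data".

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object FoldVsReduce {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("fold-vs-reduce"))

    val values = sc.parallelize(Seq(3L, 7L, 2L))

    // Each aggregation needs its own identity element:
    val max = values.fold(Long.MinValue)((a, b) => math.max(a, b)) // 0 would be wrong for all-negative data
    val sum = values.fold(0L)(_ + _)                               // 0 is only correct for sum

    // On an empty RDD, fold can only return the zero value, never "no data":
    val empty = sc.parallelize(Seq.empty[Long])
    val maxOfEmpty = empty.fold(Long.MinValue)((a, b) => math.max(a, b)) // Long.MinValue, indistinguishable from real data

    println(s"max=$max sum=$sum maxOfEmpty=$maxOfEmpty")
    sc.stop()
  }
}
{code}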

Please reopen; I would like to contribute.

> Allow rdd.reduce on empty rdd by returning an Option[T]
> -------------------------------------------------------
>
>                 Key: SPARK-25976
>                 URL: https://issues.apache.org/jira/browse/SPARK-25976
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.2
>            Reporter: Yuval Yaari
>            Priority: Minor
>
> It is sometimes useful to let the user decide what value to return when 
> reducing on an empty RDD.
> Currently, if there is no data to reduce, an UnsupportedOperationException is 
> thrown. 
> Although the user can catch that exception, it seems like a "shaky" solution, as 
> an UnsupportedOperationException might be thrown from a different location.
> Instead, we can overload the reduce method by adding a new method:
> reduce(f: (T, T) => T, defaultIfEmpty: () => T): T
> The existing reduce API will not be affected, as it will simply call the second 
> reduce method with a default value that throws UnsupportedOperationException.
>  
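For illustration only, a hedged sketch (not existing Spark API) of the shape the proposed overload and the Option-returning variant from the issue title could take, written here as user-side helpers on top of the current public RDD API:

{code:scala}
import org.apache.spark.rdd.RDD

object ReduceWithDefault {
  // Mirrors the proposed signature: reduce(f: (T, T) => T, defaultIfEmpty: () => T): T
  def reduceOrDefault[T](rdd: RDD[T])(f: (T, T) => T, defaultIfEmpty: () => T): T =
    if (rdd.isEmpty()) defaultIfEmpty() else rdd.reduce(f)

  // Option-returning variant, matching the issue title.
  def reduceOption[T](rdd: RDD[T])(f: (T, T) => T): Option[T] =
    if (rdd.isEmpty()) None else Some(rdd.reduce(f))
}
{code}

Note that the isEmpty() check launches an extra job (and evaluates the RDD twice unless it is cached), which is one argument for a built-in overload rather than a caller-side workaround.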



