[
https://issues.apache.org/jira/browse/SPARK-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466916#comment-15466916
]
Jagadeesan A S commented on SPARK-12844:
----------------------------------------
The algebraic properties have already been taken care of by this
[commit|https://github.com/apache/spark/commit/fb7e21797ed618d9754545a44f8f95f75b66757a]
on [SPARK-13339|https://issues.apache.org/jira/browse/SPARK-13339]
cc [~srowen]
> Spark documentation should be more precise about the algebraic properties of
> functions in various transformations
> -----------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-12844
> URL: https://issues.apache.org/jira/browse/SPARK-12844
> Project: Spark
> Issue Type: Documentation
> Components: Documentation
> Reporter: Jimmy Lin
> Priority: Minor
>
> Spark documentation should be more precise about the algebraic properties of
> functions in various transformations. The way the current documentation is
> written is potentially confusing. For example, in Spark 1.6, the scaladoc for
> reduce in RDD says:
> > Reduces the elements of this RDD using the specified commutative and
> > associative binary operator.
> This is precise and accurate. In the documentation of reduceByKey in
> PairRDDFunctions, on the other hand, it says:
> > Merge the values for each key using an associative reduce function.
> To be more precise, this function must also be commutative for the
> computation to be correct. Stating "commutative" for reduce but not for
> reduceByKey gives the false impression that the function in the latter does
> not need to be commutative.
> The same applies to aggregateByKey. To be precise, both seqOp and combOp need
> to be associative (mentioned) AND commutative (not mentioned) in order for
> the computation to be correct. It would be desirable to fix these
> inconsistencies throughout the documentation.
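
The point raised in the issue description can be illustrated without Spark. The following is a minimal sketch in plain Python, assuming a simplified model in which each partition is reduced left to right and the partial results are then merged in whatever order they happen to arrive. The helper names `partitioned_reduce` and `distinct_results` are hypothetical and are not Spark APIs.

```python
from functools import reduce
from itertools import permutations


def partitioned_reduce(partitions, op, merge_order):
    """Reduce each partition left to right, then merge the partial
    results in the order given by merge_order (a sequence of indices)."""
    partials = [reduce(op, part) for part in partitions]
    return reduce(op, [partials[i] for i in merge_order])


def distinct_results(partitions, op):
    """Collect the results obtained over every possible merge order."""
    orders = permutations(range(len(partitions)))
    return {partitioned_reduce(partitions, op, order) for order in orders}


def concat(a, b):
    # Associative but NOT commutative on strings.
    return a + b


def add(a, b):
    # Associative AND commutative on integers.
    return a + b


# Assumed toy data: three partitions holding the values for a single key.
string_parts = [["a", "b"], ["c"], ["d", "e"]]
int_parts = [[1, 2], [3], [4, 5]]

concat_results = distinct_results(string_parts, concat)
add_results = distinct_results(int_parts, add)

print(sorted(concat_results))  # several different strings
print(add_results)             # a single value
```

In this model, string concatenation yields a different answer for each merge order even though it is associative, while integer addition always yields the same answer. This is the sense in which associativity alone is not enough when the merge order is not fixed.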