[
https://issues.apache.org/jira/browse/SPARK-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375290#comment-15375290
]
Sean Owen commented on SPARK-16325:
-----------------------------------
I can't reproduce this on master. What error do you get, and where does it
come from? reduceByKey itself does not require an Ordering.
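
To illustrate the point, a hash-based reduce-by-key needs only equals and hashCode on the key type. The following is a minimal local sketch in plain Scala, not Spark's implementation: reduceByKeyLocal and ReduceByKeySketch are hypothetical names introduced here, and UnorderedPair is adapted from the report quoted below.

```scala
import scala.collection.mutable

// Key type adapted from the report: equality and hashing ignore element order.
final case class UnorderedPair[A](left: A, right: A) {
  override def equals(obj: Any): Boolean = obj match {
    case other: UnorderedPair[_] =>
      (left == other.left && right == other.right) ||
        (left == other.right && right == other.left)
    case _ => false
  }
  // Multiplication is commutative, so swapped pairs hash to the same value.
  override def hashCode(): Int = left.hashCode() * right.hashCode()
}

object ReduceByKeySketch {
  // Hypothetical local stand-in for a reduce-by-key (assumption: not Spark code).
  // Note that no Ordering[K] appears anywhere in the signature or the body.
  def reduceByKeyLocal[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] = {
    val acc = mutable.HashMap.empty[K, V]
    for ((k, v) <- pairs) {
      acc.get(k) match {
        case Some(prev) => acc.update(k, f(prev, v)) // merge with the existing value
        case None       => acc.update(k, v)          // first value seen for this key
      }
    }
    acc.toMap
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(
      (UnorderedPair(12, 14), Seq(123L)),
      (UnorderedPair(14, 12), Seq(456L)) // same key with the sides swapped
    )
    println(reduceByKeyLocal(data)(_ ++ _))
  }
}
```

Both entries collapse into a single key because the grouping relies on hashing and equality alone. If a compile error still mentions Ordering, the full message and the call site should show which method actually asks for it.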
> reduceByKey requires an implicit ordering which it never uses
> -------------------------------------------------------------
>
> Key: SPARK-16325
> URL: https://issues.apache.org/jira/browse/SPARK-16325
> Project: Spark
> Issue Type: Bug
> Reporter: Tofigh
> Priority: Minor
>
> Assume there is a case class as follows:
> {code}
> case class UnorderedPair[A](left: A, right: A) extends Serializable {
>   // Two pairs are equal regardless of the order of their elements.
>   override def equals(obj: Any): Boolean = obj match {
>     case other: UnorderedPair[_] =>
>       (this.left == other.left && this.right == other.right) ||
>         (this.left == other.right && this.right == other.left)
>     case _ => false
>   }
>   // Multiplication is commutative, so swapped pairs hash to the same value.
>   override def hashCode(): Int = left.hashCode() * right.hashCode()
>   def toSeq(): Seq[A] = Seq(left, right)
> }
> {code}
> And assume an RDD of (UnorderedPair[Int], Seq[Long]) pairs:
> {code}
> val rdd = sc.parallelize(Seq(
>   (UnorderedPair(12, 14), Seq(123L)),
>   (UnorderedPair(12, 14), Seq(123L))
> ))
> {code}
> Then the following code:
> {code}
> rdd.reduceByKey(_ ++ _)
> {code}
> fails with an error saying that an implicit Ordering is required.
> As a stopgap workaround, I rewrote the case class as follows:
> {code}
> case class UnorderedPair[A](left: A, right: A)
>     extends Ordered[UnorderedPair[A]] with Serializable {
>   override def equals(obj: Any): Boolean = obj match {
>     case other: UnorderedPair[_] =>
>       (this.left == other.left && this.right == other.right) ||
>         (this.left == other.right && this.right == other.left)
>     case _ => false
>   }
>   override def hashCode(): Int = left.hashCode() * right.hashCode()
>   def toSeq(): Seq[A] = Seq(left, right)
>   // Never meant to be called; it exists only so that an Ordering can be derived.
>   // Defect is an exception type whose definition is not shown in this report.
>   override def compare(that: UnorderedPair[A]): Int =
>     throw new Defect(
>       "This function should not be called. It is a workaround for a Spark bug in " +
>         "reduceByKey which requires an Ordering function.")
> }
> {code}