Josh Rosen created SPARK-9785:
---------------------------------
Summary: HashPartitioning guarantees / compatibleWith violate those methods' contracts
Key: SPARK-9785
URL: https://issues.apache.org/jira/browse/SPARK-9785
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.5.0
Reporter: Josh Rosen
Assignee: Josh Rosen
Priority: Blocker
HashPartitioning compatibility is defined with respect to the _set_ of
expressions, but in other contexts, such as computing a row's hash code, the
ordering of those expressions matters. The following regression test
illustrates the problem:
{code}
test("HashPartitioning compatibility") {
  val expressions = Seq(Literal(2), Literal(3))
  // Consider two HashPartitionings that have the same _set_ of hash expressions but which are
  // created with different orderings of those expressions:
  val partitioningA = HashPartitioning(expressions, 100)
  val partitioningB = HashPartitioning(expressions.reverse, 100)
  // These partitionings are not considered equal:
  assert(partitioningA != partitioningB)
  // However, they both satisfy the same clustered distribution:
  val distribution = ClusteredDistribution(expressions)
  assert(partitioningA.satisfies(distribution))
  assert(partitioningB.satisfies(distribution))
  // Both partitionings are compatible with and guarantee each other:
  assert(partitioningA.compatibleWith(partitioningB))
  assert(partitioningB.compatibleWith(partitioningA))
  assert(partitioningA.guarantees(partitioningB))
  assert(partitioningB.guarantees(partitioningA))
  // Given all of this, we would expect these partitionings to compute the same hashcode for
  // any given row:
  def computeHashCode(partitioning: HashPartitioning): Int = {
    val hashExprProj = new InterpretedMutableProjection(partitioning.expressions, Seq.empty)
    hashExprProj.apply(InternalRow.empty).hashCode()
  }
  assert(computeHashCode(partitioningA) === computeHashCode(partitioningB))
}
{code}
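The mismatch can also be reproduced without Spark. The following is only a minimal sketch under stated assumptions: SketchHashPartitioning is a hypothetical stand-in for HashPartitioning (plain Ints replace Catalyst expressions), and the 31-based fold is an illustrative order-sensitive hash, not the hash that Spark actually uses. It shows how a set-based compatibility check can pass while the two orderings still hash the same row differently:

```scala
// Hypothetical stand-in for Spark's HashPartitioning; NOT the real class.
// Plain Ints stand in for already-evaluated hash expressions.
final case class SketchHashPartitioning(expressions: Seq[Int], numPartitions: Int) {

  // Order-insensitive check, mirroring the set-based compatibility described above.
  def compatibleWith(other: SketchHashPartitioning): Boolean =
    expressions.toSet == other.expressions.toSet && numPartitions == other.numPartitions

  // Illustrative order-sensitive hash (assumption: not Spark's actual hash function).
  def rowHash: Int = expressions.foldLeft(1)((h, v) => 31 * h + v)

  // Partition that a row with this hash would be routed to.
  def partitionId: Int = Math.floorMod(rowHash, numPartitions)
}

object HashOrderingSketch {
  def main(args: Array[String]): Unit = {
    val a = SketchHashPartitioning(Seq(2, 3), 100)
    val b = SketchHashPartitioning(Seq(3, 2), 100)

    // The set-based check ignores ordering, so the two are reported as compatible.
    println(s"compatible: ${a.compatibleWith(b)}")

    // The hashes differ, so the same row would land in different partitions.
    println(s"hashes: ${a.rowHash} vs ${b.rowHash}")
    println(s"partitions: ${a.partitionId} vs ${b.partitionId}")
  }
}
```

In this sketch the two orderings are reported as compatible yet route the same row to different partitions, which is the kind of contract violation that the regression test above is meant to catch.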