This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 7ca355cbc225 [SPARK-46970][CORE] Rewrite `OpenHashSet#hasher` with `pattern matching`
7ca355cbc225 is described below

commit 7ca355cbc225653b090020271117a763ec59536d
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Sat Feb 3 21:07:16 2024 -0800

    [SPARK-46970][CORE] Rewrite `OpenHashSet#hasher` with `pattern matching`

    ### What changes were proposed in this pull request?
    This PR refactors how the `Hasher[T]` instance is created. The original code used a series of if-else statements to check the class type of `T` and create the corresponding `Hasher[T]` instance. The proposed change replaces that chain with Scala's pattern matching, which makes the code more concise and easier to read.

    ### Why are the changes needed?
    The changes are needed for two reasons. First, pattern matching is more idiomatic Scala, which helps readability and maintainability. Second, the original code contains a comment about a bug in the Scala 2.9.x compiler that prevented the use of pattern matching in this context. However, since Apache Spark 4.0 has switched to Scala 2.13 and the new code has passed all tests, it appears that the bug no longer exists in the new version of Sc [...]

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Pass GitHub Actions

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #44998 from LuciferYang/openhashset-hasher.

    Lead-authored-by: yangjie01 <yangji...@baidu.com>
    Co-authored-by: YangJie <yangji...@baidu.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../apache/spark/util/collection/OpenHashSet.scala | 28 +++++-----------------
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala b/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
index 6815e47a198d..faee9ce56a0a 100644
--- a/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
+++ b/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
@@ -62,28 +62,12 @@ class OpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
   // specialization to work (specialized class extends the non-specialized one and needs access
   // to the "private" variables).
 
-  protected val hasher: Hasher[T] = {
-    // It would've been more natural to write the following using pattern matching. But Scala 2.9.x
-    // compiler has a bug when specialization is used together with this pattern matching, and
-    // throws:
-    // scala.tools.nsc.symtab.Types$TypeError: type mismatch;
-    // found : scala.reflect.AnyValManifest[Long]
-    // required: scala.reflect.ClassTag[Int]
-    // at scala.tools.nsc.typechecker.Contexts$Context.error(Contexts.scala:298)
-    // at scala.tools.nsc.typechecker.Infer$Inferencer.error(Infer.scala:207)
-    // ...
-    val mt = classTag[T]
-    if (mt == ClassTag.Long) {
-      (new LongHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Int) {
-      (new IntHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Double) {
-      (new DoubleHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Float) {
-      (new FloatHasher).asInstanceOf[Hasher[T]]
-    } else {
-      new Hasher[T]
-    }
+  protected val hasher: Hasher[T] = classTag[T] match {
+    case ClassTag.Long => new LongHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Int => new IntHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Double => new DoubleHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Float => new FloatHasher().asInstanceOf[Hasher[T]]
+    case _ => new Hasher[T]
   }
 
   protected var _capacity = nextPowerOf2(initialCapacity)
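
For readers who want to try the pattern outside Spark, the following is a minimal standalone sketch of the same `ClassTag` match. It is not the Spark source: the `Hasher` classes here are simplified stand-ins with illustrative hash bodies, `HasherSelection` and `hasherFor` are hypothetical names, and the sketch omits `@specialized`, which is where the old compiler bug reportedly surfaced.

```scala
import scala.reflect.{ClassTag, classTag}

// Simplified stand-ins for the hasher classes (assumption: bodies are illustrative only).
class Hasher[T] {
  def hash(o: T): Int = o.hashCode()
}

class IntHasher extends Hasher[Int] {
  override def hash(o: Int): Int = o
}

class LongHasher extends Hasher[Long] {
  override def hash(o: Long): Int = (o ^ (o >>> 32)).toInt
}

// Hypothetical helper object, not part of Spark.
object HasherSelection {
  // ClassTag.Long and ClassTag.Int are stable identifiers on the ClassTag
  // companion, so each `case` compares the scrutinee against them with `==`,
  // which is what the removed if-else chain did by hand.
  def hasherFor[T: ClassTag]: Hasher[T] = classTag[T] match {
    case ClassTag.Long => new LongHasher().asInstanceOf[Hasher[T]]
    case ClassTag.Int  => new IntHasher().asInstanceOf[Hasher[T]]
    case _             => new Hasher[T]
  }
}
```

Calling `HasherSelection.hasherFor[Int]` should select the `IntHasher` branch, while a type with no dedicated case, such as `String`, falls through to the generic `Hasher[T]`. The `asInstanceOf` casts remain necessary because the compiler does not refine `T` to `Long` or `Int` inside a branch.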