(spark) branch master updated: [SPARK-46970][CORE] Rewrite `OpenHashSet#hasher` with `pattern matching`

dongjoon Sat, 03 Feb 2024 21:07:33 -0800

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 7ca355cbc225 [SPARK-46970][CORE] Rewrite `OpenHashSet#hasher` with `pattern matching`
7ca355cbc225 is described below

commit 7ca355cbc225653b090020271117a763ec59536d
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Sat Feb 3 21:07:16 2024 -0800

    [SPARK-46970][CORE] Rewrite `OpenHashSet#hasher` with `pattern matching`
    
    ### What changes were proposed in this pull request?
    This PR refactors how `OpenHashSet` creates its `Hasher[T]` instance.
    The original code used a chain of if-else statements to check the class
    tag of `T` and create the corresponding `Hasher[T]` instance. This change
    replaces that chain with Scala's pattern matching, which makes the code
    more concise and easier to read.
    
    ### Why are the changes needed?
    The changes are needed for several reasons. First, pattern matching is
    more idiomatic Scala, which benefits readability and maintainability.
    Second, the original code contains a comment about a bug in the Scala
    2.9.x compiler that prevented the use of pattern matching in this context.
    However, Apache Spark 4.0 has switched to Scala 2.13, and since the new
    code has passed all tests, it appears that the bug no longer exists in the
    new version of Sc [...]
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Pass GitHub Actions
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #44998 from LuciferYang/openhashset-hasher.
    
    Lead-authored-by: yangjie01 <yangji...@baidu.com>
    Co-authored-by: YangJie <yangji...@baidu.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../apache/spark/util/collection/OpenHashSet.scala | 28 +++++-----------------
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala b/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
index 6815e47a198d..faee9ce56a0a 100644
--- a/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
+++ b/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
@@ -62,28 +62,12 @@ class OpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
   // specialization to work (specialized class extends the non-specialized one and needs access
   // to the "private" variables).
 
-  protected val hasher: Hasher[T] = {
-    // It would've been more natural to write the following using pattern matching. But Scala 2.9.x
-    // compiler has a bug when specialization is used together with this pattern matching, and
-    // throws:
-    // scala.tools.nsc.symtab.Types$TypeError: type mismatch;
-    //  found   : scala.reflect.AnyValManifest[Long]
-    //  required: scala.reflect.ClassTag[Int]
-    //         at scala.tools.nsc.typechecker.Contexts$Context.error(Contexts.scala:298)
-    //         at scala.tools.nsc.typechecker.Infer$Inferencer.error(Infer.scala:207)
-    //         ...
-    val mt = classTag[T]
-    if (mt == ClassTag.Long) {
-      (new LongHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Int) {
-      (new IntHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Double) {
-      (new DoubleHasher).asInstanceOf[Hasher[T]]
-    } else if (mt == ClassTag.Float) {
-      (new FloatHasher).asInstanceOf[Hasher[T]]
-    } else {
-      new Hasher[T]
-    }
+  protected val hasher: Hasher[T] = classTag[T] match {
+    case ClassTag.Long => new LongHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Int => new IntHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Double => new DoubleHasher().asInstanceOf[Hasher[T]]
+    case ClassTag.Float => new FloatHasher().asInstanceOf[Hasher[T]]
+    case _ => new Hasher[T]
   }
 
   protected var _capacity = nextPowerOf2(initialCapacity)
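
To try the rewritten dispatch outside of Spark, the following is a minimal standalone sketch, assuming Scala 2.13. The `Hasher` classes here are simplified stand-ins written for illustration, not Spark's actual implementations, and `HasherDemo`, `hasherFor`, and the `name` method are hypothetical names that do not appear in the commit. Each `case ClassTag.Long`-style clause is a stable identifier pattern, so it is checked with `==`, the same comparison the removed if-else chain performed.

```scala
import scala.reflect.{ClassTag, classTag}

// Simplified, hypothetical stand-ins for the hasher classes (not Spark's code).
class Hasher[T] {
  def hash(o: T): Int = o.hashCode()
  def name: String = "generic" // illustration-only helper, not in the commit
}

class LongHasher extends Hasher[Long] {
  // Fold the upper 32 bits into the lower 32 bits.
  override def hash(o: Long): Int = (o ^ (o >>> 32)).toInt
  override def name: String = "long"
}

class IntHasher extends Hasher[Int] {
  override def hash(o: Int): Int = o
  override def name: String = "int"
}

object HasherDemo {
  // Pick a hasher by matching the ClassTag of T against stable identifiers.
  // The casts are needed because the compiler cannot prove T is Long or Int.
  def hasherFor[T: ClassTag]: Hasher[T] = classTag[T] match {
    case ClassTag.Long => new LongHasher().asInstanceOf[Hasher[T]]
    case ClassTag.Int => new IntHasher().asInstanceOf[Hasher[T]]
    case _ => new Hasher[T]
  }

  def main(args: Array[String]): Unit = {
    println(hasherFor[Long].name)
    println(hasherFor[Int].name)
    println(hasherFor[String].name)
  }
}
```

Under these assumptions, `hasherFor[Long]` and `hasherFor[Int]` select the primitive-specific hashers, and any other type falls through to the generic `Hasher[T]`.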


