LuciferYang opened a new pull request, #43670:
URL: https://github.com/apache/spark/pull/43670

   ### What changes were proposed in this pull request?
   This PR explicitly converts `Array` to `Seq` where a function parameter is 
defined as `Seq`, to avoid compilation warnings like the following:
   
   ```
   [error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala:57:31:
 method copyArrayToImmutableIndexedSeq in class LowPriorityImplicits2 is 
deprecated (since 2.13.0): implicit conversions from Array to 
immutable.IndexedSeq are implemented by copying; use `toIndexedSeq` explicitly 
if you want to copy, or use the more efficient non-copying 
ArraySeq.unsafeWrapArray
   [error] Applicable -Wconf / @nowarn filters for this fatal warning: 
msg=<part of the message>, cat=deprecation, 
site=org.apache.spark.ml.linalg.Vector.equals, 
origin=scala.LowPriorityImplicits2.copyArrayToImmutableIndexedSeq, 
version=2.13.0
   [error]             Vectors.equals(s1.indices, s1.values, s2.indices, 
s2.values)
   [error]                               ^
   ```
   
   There are four main ways to fix it:
   - `tools` and `mllib-local` modules: Since these modules do not import the 
`common-utils` module, `scala.collection.immutable.ArraySeq.unsafeWrapArray` is 
used directly.
   - `examples` module: Since `ArrayImplicits` is an internal utility class in 
Spark, `scala.collection.immutable.ArraySeq.unsafeWrapArray` is used directly.
   - `QueryTest`: Introduce a helper function that accepts an `Array`-typed 
`expectedAnswer`.
   - Other modules: By importing `ArrayImplicits` and calling 
`toImmutableArraySeq`, the `Array` is wrapped into an `immutable.ArraySeq`.
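   
   The two call styles listed above can be sketched roughly as follows. This 
is only an illustration: `WrapArrayExample`, `ArrayWrapOps` and `sum` are 
hypothetical names, and the implicit class is an assumed stand-in for Spark's 
internal `ArrayImplicits`, not its actual source.
   
   ```scala
   import scala.collection.immutable.ArraySeq

   object WrapArrayExample {
     // Hypothetical stand-in for an ArrayImplicits-style helper (assumption):
     // wraps the array without copying it.
     implicit class ArrayWrapOps[T](xs: Array[T]) {
       def toImmutableArraySeq: ArraySeq[T] = ArraySeq.unsafeWrapArray(xs)
     }

     // A function whose parameter is declared as Seq, as in the warning above.
     def sum(values: Seq[Int]): Int = values.sum

     def main(args: Array[String]): Unit = {
       val data = Array(1, 2, 3)
       // Style 1: call unsafeWrapArray directly (no helper import needed).
       println(sum(ArraySeq.unsafeWrapArray(data)))
       // Style 2: use the extension method provided by the implicit class.
       println(sum(data.toImmutableArraySeq))
     }
   }
   ```
   
   Both calls pass an `immutable.ArraySeq` explicitly, so the deprecated 
implicit conversion is never triggered.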
   
   ### Why are the changes needed?
   Clean up deprecated Scala API usage.
   
   Why use `ArraySeq.unsafeWrapArray` instead of `toIndexedSeq`:
   
   1. `ArraySeq.unsafeWrapArray` avoids the collection copy that `toIndexedSeq` 
performs, so it has lower memory overhead and some performance advantage. 
Moreover, `ArraySeq.unsafeWrapArray` is faster in scenarios such as:
     - `Array.fill.toImmutableArraySeq` versus `IndexedSeq.fill`
     - `Array.apply(data).toImmutableArraySeq` versus `IndexedSeq.apply(data)`
     - `Array.emptyXXArray.toImmutableArraySeq` versus `IndexedSeq.empty`.
   
   2. In Scala 2.12, when the function is defined as 
   
   ```
   def func(input: Seq[T]): R = {
        ...
   }
   ```
   
   and an `Array` is passed as the function input, it is implicitly converted 
by default through the `scala.Predef#genericArrayOps` function, which is 
implemented as follows:
   
   ```scala
     implicit def genericArrayOps[T](xs: Array[T]): ArrayOps[T] = (xs match {
       case x: Array[AnyRef]  => refArrayOps[AnyRef](x)
       case x: Array[Boolean] => booleanArrayOps(x)
       case x: Array[Byte]    => byteArrayOps(x)
       case x: Array[Char]    => charArrayOps(x)
       case x: Array[Double]  => doubleArrayOps(x)
       case x: Array[Float]   => floatArrayOps(x)
       case x: Array[Int]     => intArrayOps(x)
       case x: Array[Long]    => longArrayOps(x)
       case x: Array[Short]   => shortArrayOps(x)
       case x: Array[Unit]    => unitArrayOps(x)
       case null              => null
     }).asInstanceOf[ArrayOps[T]]
   
     implicit def booleanArrayOps(xs: Array[Boolean]): ArrayOps.ofBoolean   = 
new ArrayOps.ofBoolean(xs)
     implicit def byteArrayOps(xs: Array[Byte]): ArrayOps.ofByte            = 
new ArrayOps.ofByte(xs)
     implicit def charArrayOps(xs: Array[Char]): ArrayOps.ofChar            = 
new ArrayOps.ofChar(xs)
     implicit def doubleArrayOps(xs: Array[Double]): ArrayOps.ofDouble      = 
new ArrayOps.ofDouble(xs)
     implicit def floatArrayOps(xs: Array[Float]): ArrayOps.ofFloat         = 
new ArrayOps.ofFloat(xs)
     implicit def intArrayOps(xs: Array[Int]): ArrayOps.ofInt               = 
new ArrayOps.ofInt(xs)
     implicit def longArrayOps(xs: Array[Long]): ArrayOps.ofLong            = 
new ArrayOps.ofLong(xs)
     implicit def refArrayOps[T <: AnyRef](xs: Array[T]): ArrayOps.ofRef[T] = 
new ArrayOps.ofRef[T](xs)
     implicit def shortArrayOps(xs: Array[Short]): ArrayOps.ofShort         = 
new ArrayOps.ofShort(xs)
     implicit def unitArrayOps(xs: Array[Unit]): ArrayOps.ofUnit            = 
new ArrayOps.ofUnit(xs)
   ```
   
   This implicit conversion wraps the input data into a 
`mutable.WrappedArray`. For example, `Array[Int]` data is wrapped into 
`mutable.WrappedArray.ofInt`:
   
   ```scala
     final class ofInt(override val repr: Array[Int]) extends AnyVal with 
ArrayOps[Int] with ArrayLike[Int, Array[Int]] {
   
       override protected[this] def thisCollection: WrappedArray[Int] = new 
WrappedArray.ofInt(repr)
       override protected[this] def toCollection(repr: Array[Int]): 
WrappedArray[Int] = new WrappedArray.ofInt(repr)
       override protected[this] def newBuilder = new ArrayBuilder.ofInt
   
       def length: Int = repr.length
       def apply(index: Int): Int = repr(index)
       def update(index: Int, elem: Int) { repr(index) = elem }
     }
   
     final class ofInt(val array: Array[Int]) extends WrappedArray[Int] with 
Serializable {
       def elemTag = ClassTag.Int
       def length: Int = array.length
       def apply(index: Int): Int = array(index)
       def update(index: Int, elem: Int) { array(index) = elem }
       override def hashCode = MurmurHash3.wrappedArrayHash(array)
       override def equals(that: Any) = that match {
         case that: ofInt => Arrays.equals(array, that.array)
         case _ => super.equals(that)
       }
     }
   ```
   
   As we can see, in Scala 2.12 an `Array` input is implicitly converted into 
a `mutable.WrappedArray`, and no collection copy is performed.
   
   In Scala 2.13, the default implicit conversion performs a defensive 
collection copy. However, given that Spark already relied on the non-copying 
behavior under Scala 2.12, we can assume that it is still safe to explicitly 
wrap `Array` input into an `immutable.ArraySeq` without copying in Scala 2.13.
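   
   The difference between copying and wrapping can be seen in a minimal 
sketch for Scala 2.13. `CopyVersusWrap` and `firstElements` are hypothetical 
names used only for illustration; the sketch mutates the array after 
conversion purely to make the sharing visible.
   
   ```scala
   import scala.collection.immutable.ArraySeq

   object CopyVersusWrap {
     // Returns the first element seen through the copied and the wrapped view
     // after the underlying array has been mutated.
     def firstElements(): (Int, Int) = {
       val data = Array(1, 2, 3)
       val copied = data.toIndexedSeq                 // copies the array
       val wrapped = ArraySeq.unsafeWrapArray(data)   // shares the array
       data(0) = 42
       (copied(0), wrapped(0))
     }

     def main(args: Array[String]): Unit = {
       val (fromCopy, fromWrap) = firstElements()
       println(fromCopy) // 1: the copy is unaffected by the later write
       println(fromWrap) // 42: the wrapper reflects the later write
     }
   }
   ```
   
   Wrapping without a copy is therefore only safe when the array is not 
mutated after it has been wrapped, which is the same condition the Scala 2.12 
`mutable.WrappedArray` conversion already imposed.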
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Pass GitHub Actions
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No

