[GitHub] [spark] LuciferYang commented on pull request #37185: [SPARK-39766][SQL] Ensure `arrayOfAnyAsSeq` in `GenericArrayDataBenchmark` has same performance when using Scala 2.12 and 2.13

GitBox Sat, 16 Jul 2022 11:15:58 -0700


LuciferYang commented on PR #37185:
URL: https://github.com/apache/spark/pull/37185#issuecomment-1186257184


   @srowen 
   
   To summarize:
   
   - Risk: Adding or removing elements on an `ArraySeq` (both mutable and 
immutable) returns a new `ArraySeq`, so the underlying array cannot be grown 
in place. One risk remains: `mutable.ArraySeq.sortInPlace` sorts the original 
array in place. It is a new method in Scala 2.13, and Spark doesn't use it.
   
   - Performance: The throughput of the `arrayOfAnyAsSeq` scenario is 
`41.4M/s` before this PR and `2491.6M/s` after it, roughly a 60x improvement. 
For comparison, the throughput with Scala 2.12 is `395.1M/s` (Java 8), 
`1845.1M/s` (Java 11), and `1245.5M/s` (Java 17).
   
   - Whether `case array: Array[_] => array.toSeq.toArray[Any]` works: This is 
still a TODO on my side. Let me track it; if it turns out to be an issue, I 
will file a new Jira and try to fix it.
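
   The risk described in the first point can be illustrated with a minimal sketch, assuming Scala 2.13. The object and method names below are hypothetical and are not Spark code; the sketch only shows that appending to a wrapped `mutable.ArraySeq` leaves the backing array untouched, while `sortInPlace` reorders it.

```scala
import scala.collection.mutable

// Hypothetical illustration (not Spark code), assuming Scala 2.13.
object ArraySeqAliasingSketch {

  // Appending returns a new ArraySeq; the wrapped array keeps its length and contents.
  def appendLeavesBackingUntouched(): (List[Int], List[Int]) = {
    val backing = Array(3, 1, 2)
    val wrapped = mutable.ArraySeq.make(backing) // wraps the array without copying
    val appended = wrapped :+ 4                  // allocates a new ArraySeq
    (backing.toList, appended.toList)
  }

  // sortInPlace sorts the array that the ArraySeq wraps, so the caller's array changes.
  def sortInPlaceMutatesBacking(): List[Int] = {
    val backing = Array(3, 1, 2)
    val wrapped = mutable.ArraySeq.make(backing)
    wrapped.sortInPlace()
    backing.toList
  }
}
```

   In this sketch the only operation that changes the caller's array is `sortInPlace`, which matches the point that wrapping is safe as long as that method is not called on the wrapper.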
   
   
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


