LuciferYang opened a new pull request, #37353: URL: https://github.com/apache/spark/pull/37353
### What changes were proposed in this pull request?

`Utils.getIteratorSize` is slightly slower than `Iterator.size` when using Scala 2.13. However, `Iterator.size` cannot be called directly: `Utils.getIteratorSize` returns a `Long`, whereas `Iterator.size` returns an `Int`, so using it directly cannot guarantee that overflow will not occur.

This PR therefore adds an `if (iterator.knownSize > 0)` conditional branch to optimize the method when using Scala 2.13. The optimization follows the approach of `IterableOnceOps.size`:

https://github.com/scala/scala/blob/cbc012e573346dc685c478eec5fcbb56d22ea884/src/library/scala/collection/IterableOnce.scala#L835-L849

<img width="553" alt="image" src="https://user-images.githubusercontent.com/1475305/182060742-c56ead62-4164-4633-a501-48e2e3151752.png">

### Why are the changes needed?

Optimize `Utils.getIteratorSize` for Scala 2.13.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

- Pass GitHub Actions
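
For illustration, the branch described above might look roughly like the following. This is a minimal sketch assuming Scala 2.13 or later, not the code from the PR: the object name `IteratorSizeSketch` is hypothetical, and the counting loop in the fallback path is an assumption about how the size is computed when it is not known up front.

```scala
object IteratorSizeSketch {
  // Hypothetical stand-in for Utils.getIteratorSize; not the actual patch.
  def getIteratorSize(iterator: Iterator[_]): Long = {
    // knownSize (Scala 2.13+) returns -1 when the size cannot be
    // determined without traversing the iterator.
    val known = iterator.knownSize
    if (known > 0) {
      // Fast path: the size is already known, so no traversal is needed.
      known.toLong
    } else {
      // Fallback (assumed): count elements in a Long so the result
      // cannot overflow the way an Int counter could.
      var count = 0L
      while (iterator.hasNext) {
        count += 1L
        iterator.next()
      }
      count
    }
  }
}
```

In this sketch a known size of 0 falls through to the loop, which returns 0 immediately because `hasNext` is false. Note also that the fast path does not advance the iterator, whereas the fallback consumes it.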
