eejbyfeldt commented on PR #38427:
URL: https://github.com/apache/spark/pull/38427#issuecomment-1296802435

   > > 
   > 
   > A little personal thought:the limitation of `immutable.Seq` is caused by 
Scala version migration, this comes from Scala 2.13, which changes the 
predefined of `Seq` from `scala.collection.Seq(2.12)` to 
`scala.collection.immutable.Seq`, we only promise that it is a 
`scala.collection.Seq` when using Scala 2.12 now and 2.13 is beta version, so 
if we agree to explicitly define `Seq` as `scala.collection.Seq` in the API, 
the things may become simple(can avoid calling `toSeq` or `toIndexedSeq`).
   
   The reason I changed it to `IndexedSeq` is that `mutable.ArrayBuffer` is an 
`IndexedSeq` on 2.12 so you can do this without copying:
   ```
   scala> val a: IndexedSeq[Int] =  scala.collection.mutable.ArrayBuffer[Int]()
   a: IndexedSeq[Int] = ArrayBuffer()
   ```
   so I assumed that `toIndexedSeq` would be a no-op like `toSeq` in 2.12. But 
this seems to be wrong.
   
   But I agree that if we change `Seq` to `scala.collection.Seq` or 
`IndexedSeq` to `scala.collection.IndexedSeq` it would be possible to just 
remove all the `toSeq` or `toIndexedSeq` (except one place for toIndexSeq where 
we currently use a ListBuffer but that could be changed to a ArrayBuffer) and 
get the performance on 2.13 without getting worse on 2.12.
   
   
   > Give more information for this way: convert ArrayBuffer to immutable.Seq 
or immutable.IndexedSeq, we should use immutable.ArraySeq.from(buffer) or 
immutable.ArraySeq.unsafeWrapArray(buffer.toArray) instead of buffer.toSeq( or 
buffer.toIndexedSeq), from and unsafeWrapArray will be more efficient.
   
   >  At the same time, there are many other similar cases. Should we fixed 
them all in one
   
   So the idea here would be to define some helper that can be used to convert 
our mutable collection to a `Seq` and  have a no-op operation on scala 2.12  
and do it using an ArraySeq on 2.13. And therefore accepting an extra copy on 
2.13?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to