eejbyfeldt commented on PR #38427:
URL: https://github.com/apache/spark/pull/38427#issuecomment-1300650411

   > OK, I think we can't accept that much perf degradation. If there's a 
simple way to refactor the code to make both faster, that seems OK. Ideally we 
avoid separate code branches for 2.12 vs 2.13, unless it's simple and important 
here
   
   I think the two options that have been discussed are the following.
   
   The first is to have separate code branches for 2.12 and 2.13 that convert 
the mutable collections to `Seq`. For 2.12 this would be a no-op, since `Seq` 
is an alias for `scala.collection.Seq`. For 2.13 we would copy the data into an 
`ArraySeq`, since in 2.13 `Seq` is an alias for 
`scala.collection.immutable.Seq`. The gain, I think, is that on 2.13 we would 
use an immutable collection instead of a `scala.collection.Seq` that might 
point to a mutable collection.
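   
   A rough sketch of what the 2.13 branch of this option might look like. The 
names `SeqCompat` and `toImmutableSeq` are hypothetical and not taken from the 
PR; the 2.12 counterpart would simply return its argument unchanged. It assumes 
Scala 2.13, where `ArraySeq.untagged` is available.

```scala
import scala.collection.immutable.ArraySeq
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper for a 2.13-only source directory (illustrative name).
// On 2.12 the equivalent helper would just return `values` as-is.
object SeqCompat {
  // Copies the elements into an immutable ArraySeq. This copy is the extra
  // cost on 2.13 compared with the no-op on 2.12.
  def toImmutableSeq[T](values: scala.collection.Seq[T]): Seq[T] =
    values.to(ArraySeq.untagged)
}

object SeqCompatDemo {
  def run(): (Seq[Int], ArrayBuffer[Int]) = {
    val buf = ArrayBuffer(1, 2, 3)
    val snapshot = SeqCompat.toImmutableSeq(buf)
    buf += 4 // later mutation of the buffer does not affect the copy
    (snapshot, buf)
  }
}
```

   The copy buys immutability: the returned `Seq` cannot change underneath the 
caller, at the price of one array allocation per conversion.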
   
   The other option would be to change the code to use `scala.collection.Seq` 
explicitly instead of `Seq` (`scala.collection.IndexedSeq` would also be an 
option) and to remove the explicit `toSeq` calls. The code would then have the 
same meaning and performance on 2.13 as the current code has on 2.12.
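   
   As an illustration of this second option, here is a minimal sketch with a 
made-up method (`collectIds` is hypothetical, not code from the PR). The return 
type is spelled out as `scala.collection.Seq`, so the buffer is returned 
without a `toSeq` call and without a copy on either Scala version.

```scala
import scala.collection.mutable.ArrayBuffer

object ExplicitSeqDemo {
  // Hypothetical method. The explicit scala.collection.Seq return type means
  // the same thing on 2.12 and 2.13, so no conversion is needed.
  def collectIds(n: Int): scala.collection.Seq[Int] = {
    val buf = new ArrayBuffer[Int]()
    var i = 0
    while (i < n) {
      buf += i
      i += 1
    }
    buf // returned as-is: ArrayBuffer is a scala.collection.Seq
  }
}
```

   The trade-off is that callers receive a `scala.collection.Seq` that may 
still be backed by a mutable collection, which is the same situation as the 
current 2.12 code.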
   
   @srowen Which approach do you think is preferable?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
