[ https://issues.apache.org/jira/browse/SPARK-27681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838766#comment-16838766 ]

Marcelo Vanzin commented on SPARK-27681:
----------------------------------------

bq. But, user code that passes a non-immutable Seq to a Spark API that accepts 
scala.Seq now will no longer compile in 2.13.

That's true, but isn't that what the Scala developers intend anyway? That will 
be true for any Scala code, not just Spark. That's assuming they didn't add any 
implicit conversion from a mutable Seq to an immutable one, which would solve 
that problem.

My problem with your suggestion is that Spark developers would now have to 
remember, everywhere, to import the different {{Seq}} type. And if they forget, 
probably nothing will break until it's too late to notice. It's a 
counter-intuitive change for developers, and I'm not seeing a lot of benefit 
from it.
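
Concretely, every Spark source file that means the general {{Seq}} would need 
something along these lines (just a rough sketch, not actual Spark code):

{code}
// Sketch: shadow the default scala.Seq (immutable in 2.13) with the general one.
import scala.collection.Seq

object Example {
  // With the import above, this signature means scala.collection.Seq on both
  // 2.12 and 2.13. Drop the import and it silently becomes immutable.Seq on
  // 2.13 -- and this file still compiles, so nothing flags the mistake.
  def firstTwo(xs: Seq[String]): Seq[String] = xs.take(2)
}
{code}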

Here's an example of how your proposed change would break user code (I just ran 
this on Scala 2.13-RC1):

{code}
scala> def foo(): scala.collection.Seq[String] = Nil
foo: ()scala.collection.Seq[String]

scala> val s: Seq[String] = foo()
                               ^
       error: type mismatch;
        found   : Seq[String] (in scala.collection)
        required: Seq[String] (in scala.collection.immutable)
{code}

So, aside from Spark developers having to remember to use the different {{Seq}} 
type, users might also have to change their own code so that their internal 
APIs use that type as well, or errors like the one above will occur.
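
To be concrete, the user in that example would have to do something like one of 
these to keep compiling on 2.13 (just a sketch, assuming they control {{foo()}}):

{code}
// Option 1: change the internal API to return the immutable Seq explicitly.
def foo(): scala.collection.immutable.Seq[String] = Nil

// Option 2: keep the API and convert at the call site; in 2.13, .toSeq on a
// collection.Seq returns an immutable.Seq.
val s: Seq[String] = foo().toSeq
{code}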

BTW, I also checked, and there's no automatic promotion from mutable to 
immutable {{Seq}}s:

{code}
scala> val s: Seq[String] = scala.collection.mutable.ArrayBuffer[String]()
                                                                        ^
       error: type mismatch;
        found   : scala.collection.mutable.ArrayBuffer[String]
        required: Seq[String]
{code}
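
So user code that does that today would need an explicit conversion, something 
along these lines (again just a sketch):

{code}
// In 2.13, .toSeq on a mutable collection returns an immutable.Seq,
// so spelling the conversion out compiles.
val s: Seq[String] = scala.collection.mutable.ArrayBuffer[String]().toSeq
{code}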

So I sort of understand your desire to keep things more similar, but I'm not 
really seeing the advantages you see.

> Use scala.collection.Seq explicitly instead of scala.Seq alias
> --------------------------------------------------------------
>
>                 Key: SPARK-27681
>                 URL: https://issues.apache.org/jira/browse/SPARK-27681
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib, Spark Core, SQL, Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Major
>
> {{scala.Seq}} is widely used in the code, and is an alias for 
> {{scala.collection.Seq}} in Scala 2.12. It will become an alias for 
> {{scala.collection.immutable.Seq}} in Scala 2.13. In many cases, this will be 
> fine, as Spark users using Scala 2.13 will also have this changed alias. In 
> some cases it may be undesirable, as it will cause some code to compile in 
> 2.12 but not in 2.13. In some cases, making the type {{scala.collection.Seq}} 
> explicit so that it doesn't vary can help avoid this, so that Spark apps 
> might cross-compile for 2.12 and 2.13 with the same source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
