[
https://issues.apache.org/jira/browse/FLINK-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257763#comment-17257763
]
Nick Burkard edited comment on FLINK-13414 at 1/3/21, 2:16 PM:
---------------------------------------------------------------
Hi everyone,
I spent the past few days working on this, building on [~ariskoliopoulos]'s
work - there's a lot of work that will be needed to properly migrate to
supporting 2.13. I was using [this
document|https://docs.scala-lang.org/overviews/core/collections-migration-213.html]
provided by Scala's website, plus their [compat
library|https://github.com/scala/scala-collection-compat], to handle the
migration steps. I wanted to get the community's opinion on a few different
points. I think the work can be split into a few different tickets to limit
overall impact.
#1 - Scala 2.11 is very old, most modern Scala libraries do not support it
anymore.
The typical support path is 2.12 and 2.13. Supporting 2.11 while also
supporting 2.13 will make this migration process more difficult. If there's
consensus to drop 2.11 support, perhaps that should come as part of its own
ticket first. This will make upgrading libraries to support 2.13 much easier as
well. On the subject of 2.12, I noticed `flink-scala-shell` only supports 2.11
and I am unaware if there's interests to continue upgrading it. Wanted to
mention it in case anyone else has insight.
#2 - Flink makes extensive use of implicit type conversions.
A lot of Flink's Scala code, especially in the flink-table submodules, make
repeated use of implicit conversions to convert between Java & Scala types. I
mention this because `import scala.collection.JavaConversions.__`_ is entirely
gone in 2.13, which is what provides said implicit conversions. We can instead
start to import scala.collection.convert.ImplicitConversions._, which is still
present (albeit deprecated) in 2.13. Worth noting this new import is not
present in 2.11, it would need to come after point #1. Again, I think this
could be its own ticket so as to limit the impact.
#3 - `TraversableOnce` is gone in 2.13.
Internal Java code uses `TraversableOnce` for
[two|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerConfigSnapshot.java]
[serializers|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerSnapshot.java],
with comments explicitly stating neither cannot be written in Scala due to
constructor limitations with the language. This poses a problem, since as far
as I can tell we can't import the Scala compat library to get access to
`IterableOnce` in Java code for cross-compiling. We may have to move these two
files into separate java folders that maven conditionally compiles based on
Scala version - one for `TraversableOnce` with 2.12 and one for `IterableOnce`
with 2.13. I was able to get it compiling with this strategy but am open to
other suggestions.
#4 - Collection conversion has changed.
The collection compat library will mitigate the difficulty in making this
change, specifically from using `collection.to[Type]` to using
`collection.to(Type)`. Probably safe to also include the change for switching
from `mutable.MutableList` to `mutable.ListBuffer` here since its surface area
is not as user-facing.
#5 - Code generation has changed.
Like previously mentioned, the migration from `CanBuildFrom` to `Factory` /
`BuildFrom` will be necessary, but shouldn't be large, since it's contained to
the module `flink-scala` from what I could tell.
#6 - `Seq` in 2.12 and 2.13 mean different things.
2.12 refers to the mutable version, while 2.13 refers to the immutable
version. This is also present in varargs Scala APIs, such as
`env.fromCollection("one", "two")`. This poses a problem, as `Seq` is used in
each of the Scala modules and is quite prevalent in user-facing APIs. Although
the ambiguity for varargs between versions can't be avoided, we could choose to
disambiguate typical `Seq` use for 2.12 and 2.13 by enforcing consistent usage.
This is likely where most of the work will reside depending on the strategy
chosen. The aforementioned migration doc provides a few options, such as
internally shadowing `Seq` to make unqualified use illegal (i.e. only allow
`immutable.Seq` or `collection.Seq`, not `Seq`). I did notice some issues with
the type serializer though with using qualified versions (`immutable.Seq`), so
perhaps we want to consider shadowing `Seq` to always mean one version? Either
way, this is a large change and should be discussed early.
Overall, the changes for points 1-5 are relatively easy when separated, while
#6 will likely be a larger change considering how much of the blink table code
is written in Scala. Open to any and all suggestions. :)
was (Author: nickburkard):
Hi everyone,
I spent the past few days working on this, building on [~ariskoliopoulos]'s
work - there's a lot of work that will be needed to properly migrate to
supporting 2.13. I was using [this
document|https://docs.scala-lang.org/overviews/core/collections-migration-213.html]
provided by Scala's website, plus their [compat
library|https://github.com/scala/scala-collection-compat], to handle the
migration steps. I wanted to get the community's opinion on a few different
points. I think the work can be split into a few different tickets to limit
overall impact.
#1 - Scala 2.11 is very old, most modern Scala libraries do not support it
anymore.
The typical support path is 2.12 and 2.13. Supporting 2.11 while also
supporting 2.13 will make this migration process more difficult. If there's
consensus to drop 2.11 support, perhaps that should come as part of its own
ticket first. This will make upgrading libraries to support 2.13 much easier as
well. On the subject of 2.12, I noticed `flink-scala-shell` only supports 2.11
and I am unaware if there's interests to continue upgrading it. Wanted to
mention it in case anyone else has insight.
#2 - Flink makes extensive use of implicit type conversions.
A lot of Flink's Scala code, especially in the flink-table submodules, make
repeated use of implicit conversions to convert between Java & Scala types. I
mention this because `import scala.collection.JavaConversions.__`_ is entirely
gone in 2.13, which is what provides said implicit conversions. We can instead
start to import scala.collection.convert.ImplicitConversions._, which is still
present (albeit deprecated) in 2.13. Worth noting this new import is not
present in 2.11, it would need to come after point #1. Again, I think this
could be its own ticket so as to limit the impact.
#3 - `TraversableOnce` is gone in 2.13.
Internal Java code uses `TraversableOnce` for
[two|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerConfigSnapshot.java]
[serializers|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerSnapshot.java],
with comments explicitly stating neither cannot be written in Scala due to
constructor limitations with the language. This poses a problem, since as far
as I can tell we can't import the Scala compat library to get access to
`IterableOnce` in Java code for cross-compiling. We may have to move these two
files into separate java folders that maven conditionally compiles based on
Scala version - one for `TraversableOnce` with 2.12 and one for `IterableOnce`
with 2.13. I was able to get it compiling with this strategy but am open to
other suggestions.
#4 - Collection conversion has changed.
The collection compat library will mitigate the difficulty in making this
change, specifically from using `collection.to[Type]` to using
`collection.to(Type)`. Probably safe to also include the change for switching
from `mutable.MutableList` to `mutable.ListBuffer` here since its surface area
is not as user-facing.
#5 - Code generation has changed.
Like previously mentioned, the migration from `CanBuildFrom` to `Factory` /
`BuildFrom` will be necessary, but shouldn't be large, since it's contained to
the module `flink-scala` from what I could tell.
#6 - `Seq` in 2.12 and 2.13 mean different things.
2.12 refers to the mutable version, while 2.13 refers to the immutable
version. This is also present in varargs Scala APIs, such as
`env.fromCollection("one", "two")`. This poses a problem, as `Seq` is used in
each of the Scala modules and is quite prevalent in user-facing APIs. Although
the ambiguity for varargs between versions can't be avoided, we could choose to
disambiguate typical `Seq` use for 2.12 and 2.13 by enforcing consistent usage.
This is likely where most of the work will reside depending on the strategy
chosen. The aforementioned migration doc provides a few options, such as
internally shadowing `Seq` to make unqualified use illegal (i.e. only allow
`immutable.Seq` or `collection.Seq`, not `Seq`). I did notice some issues with
the type serializer though with using qualified versions (`immutable.Seq`), so
perhaps we want to consider shading `Seq` to always mean one version? Either
way, this is a large change and should be discussed early.
Overall, the changes for points 1-5 are relatively easy when separated, while
#6 will likely be a larger change considering how much of the blink table code
is written in Scala. Open to any and all suggestions. :)
> Add support for Scala 2.13
> --------------------------
>
> Key: FLINK-13414
> URL: https://issues.apache.org/jira/browse/FLINK-13414
> Project: Flink
> Issue Type: New Feature
> Components: API / Scala
> Reporter: Chaoran Yu
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)