Like Sean, I'm not sure that this will require a new major version, but we should also be looking at Java 9 & 10 support -- particularly their improved behavior in containerized environments (memory limits taken from cgroups rather than sysconf; support for cpusets). In that regard, we should also be looking at using the latest Scala 2.11.x maintenance release in the current Spark branches.
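
As a concrete illustration (plain JVM APIs only, nothing Spark-specific), the difference shows up directly in what the runtime reports when something like the rough sketch below runs inside a memory- and CPU-limited container:

    // Rough sketch using stock JVM APIs only (nothing Spark-specific).
    // On Java 8 these numbers are derived from the host (sysconf /
    // physical RAM); on Java 10 with -XX:+UseContainerSupport, or Java 9
    // with -XX:+UnlockExperimentalVMOptions
    // -XX:+UseCGroupMemoryLimitForHeap for the heap limit, they reflect
    // the container's cgroup/cpuset settings instead.
    object ContainerResources {
      def main(args: Array[String]): Unit = {
        val rt = Runtime.getRuntime
        println(s"availableProcessors = ${rt.availableProcessors()}")
        println(s"maxMemory (bytes)   = ${rt.maxMemory()}")
      }
    }

On Java 8 an executor in a small container can over-commit itself badly based on those host-level numbers, which is why the newer JDKs matter for containerized deployments.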
On Thu, Apr 5, 2018 at 5:45 AM, Sean Owen <sro...@gmail.com> wrote:

> On Wed, Apr 4, 2018 at 6:20 PM Reynold Xin <r...@databricks.com> wrote:
>
>> The primary motivating factor IMO for a major version bump is to support
>> Scala 2.12, which requires minor API breaking changes to Spark’s APIs.
>> Similar to Spark 2.0, I think there are also opportunities for other
>> changes that we know have been biting us for a long time but can’t be
>> changed in feature releases (to be clear, I’m actually not sure they are
>> all good ideas, but I’m writing them down as candidates for consideration):
>
> IIRC from looking at this, it is possible to support 2.11 and 2.12
> simultaneously. The cross-build already works now in 2.3.0. Barring some
> big change needed to get 2.12 fully working -- and that may be the case --
> it nearly works that way now.
>
> Compiling vs 2.11 and 2.12 does however result in some APIs that differ in
> byte code. However Scala itself isn't mutually compatible between 2.11 and
> 2.12 anyway; that's never been promised as compatible.
>
> (Interesting question about what *Java* users should expect; they would
> see a difference in 2.11 vs 2.12 Spark APIs, but that has always been true.)
>
> I don't disagree with shooting for Spark 3.0, just saying I don't know if
> 2.12 support requires moving to 3.0. But, Spark 3.0 could consider dropping
> 2.11 support if needed to make supporting 2.12 less painful.
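
On the byte-code point: a hypothetical sketch of the paired Scala/Java overload pattern (made-up names, not Spark's actual API) shows the kind of thing that behaves differently under 2.12, where function literals can satisfy any SAM type:

    // Hypothetical sketch (names are made up, not Spark's actual API) of
    // the paired Scala/Java overload pattern discussed above.
    trait JMapFunction[T, U] {          // Java-style SAM interface
      def call(t: T): U
    }

    class ExampleApi[T](value: T) {
      def transform[U](f: T => U): U = f(value)                  // Scala-friendly
      def transform[U](f: JMapFunction[T, U]): U = f.call(value) // Java-friendly
    }

    object OverloadDemo {
      def main(args: Array[String]): Unit = {
        val api = new ExampleApi(41)
        // An explicitly typed function value resolves to the Function1
        // overload on both 2.11 and 2.12:
        val inc: Int => Int = _ + 1
        println(api.transform(inc)) // 42
        // A bare literal, e.g. api.transform(x => x + 1), is where the two
        // Scala versions can diverge: on 2.12 the literal is also eligible
        // for the SAM overload, so resolution and the emitted byte code can
        // differ from 2.11.
      }
    }

Explicitly typed function values stay unambiguous on both versions; bare lambdas at existing call sites are where this kind of source- and byte-code-level divergence shows up.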