[ https://issues.apache.org/jira/browse/SPARK-25737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-25737:
------------------------------
    Environment:     (was: the same text as the new Description below)
    Description: 
In ancient times in 2013, JavaSparkContext got a superclass JavaSparkContextVarargsWorkaround to deal with some Scala 2.7 issue:
[http://www.scala-archive.org/Workaround-for-implementing-java-varargs-in-2-7-2-final-td1944767.html#a1944772]
I believe this was really resolved by the {{@varargs}} annotation in Scala 2.9.
I believe we can now remove this workaround. Along the way, I think we can also avoid the duplicated definitions of {{union()}}. Where we should be able to just have one varargs method, we have up to 3 forms:
- {{union(RDD, Seq/List)}}
- {{union(RDD*)}}
- {{union(RDD, RDD*)}}
While this pattern is sometimes used to avoid type collision due to erasure, I don't think it applies here.
After cleaning it up, we'll have 1 SparkContext and 3 JavaSparkContext methods (for the 3 Java RDD types), not 11 methods.
The only difference for callers in Spark 3 would be that {{sc.union(Seq(rdd1, rdd2))}} now has to be {{sc.union(rdd1, rdd2)}} (simpler) or {{sc.union(Seq(rdd1, rdd2): _*)}}

> Remove JavaSparkContextVarargsWorkaround and standardize union() methods
> ------------------------------------------------------------------------
>
>                 Key: SPARK-25737
>                 URL: https://issues.apache.org/jira/browse/SPARK-25737
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Minor
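
For illustration, the caller-side change described in this issue can be sketched without Spark. This is only a minimal sketch under stated assumptions: {{ToyRDD}} and {{ToyContext}} are hypothetical stand-ins invented here, not Spark classes, and the real {{union()}} signatures may differ in detail. It shows a single method annotated with {{scala.annotation.varargs}} taking the place of separate Seq/List and (first, rest*) overloads, and the two call forms the description mentions.

```scala
import scala.annotation.varargs

// Hypothetical stand-in for an RDD so the sketch runs without Spark.
final case class ToyRDD[T](items: Seq[T])

// Hypothetical stand-in for a context object; not Spark's API.
object ToyContext {
  // A single varargs method. The @varargs annotation asks the compiler to
  // also emit a Java-style forwarder that takes an array (ToyRDD<T>...),
  // so Java callers can use ordinary varargs syntax.
  @varargs
  def union[T](rdds: ToyRDD[T]*): ToyRDD[T] =
    ToyRDD(rdds.flatMap(_.items))
}

object UnionDemo {
  def main(args: Array[String]): Unit = {
    val rdd1 = ToyRDD(Seq(1, 2))
    val rdd2 = ToyRDD(Seq(3))

    // Direct varargs call, the "simpler" form from the description.
    val direct = ToyContext.union(rdd1, rdd2)

    // An existing Seq has to be expanded explicitly with `: _*`.
    val expanded = ToyContext.union(Seq(rdd1, rdd2): _*)

    println(direct == expanded)
  }
}
```

With only the varargs form defined, passing a bare {{Seq(rdd1, rdd2)}} would not type-check against {{ToyRDD[T]*}}, which mirrors the one source-level difference the description calls out for Spark 3 callers.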