bq. same case with sc.parallelize() or sc.makeRDD()

I think so.
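In case it helps, here is a minimal sketch that checks both points in a standalone program rather than the shell. The object name `UnionCheck`, the app name, the `local[2]` master, and the shortened sample lists are my own choices for illustration and are not from your session. As far as I can tell from SparkContext.scala, `makeRDD(seq, numSlices)` just calls `parallelize(seq, numSlices)`; there is also a second `makeRDD` overload that takes location preferences, which `parallelize` does not have.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone check; names and sample data are illustrative only.
object UnionCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("union-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = List(1, 3, 5, 7, 9, 11)
      val b = List(1, 3, 13, 15, 9)

      // makeRDD(seq, numSlices) and parallelize(seq, numSlices) should yield the same data.
      val aRdd = sc.parallelize(a, 2)
      val aViaMakeRDD = sc.makeRDD(a, 2)
      assert(aRdd.collect().sameElements(aViaMakeRDD.collect()))

      // ++ is defined in terms of union, so both results should match.
      val bRdd = sc.parallelize(b, 2)
      val viaPlusPlus = aRdd ++ bRdd
      val viaUnion = aRdd.union(bRdd)
      assert(viaPlusPlus.collect().sameElements(viaUnion.collect()))

      // union keeps duplicates, so the count is the sum of both input sizes.
      assert(viaUnion.count() == a.size + b.size)
    } finally {
      sc.stop()
    }
  }
}
```

Note that `union` does not deduplicate, which is why your counts come out as 23 and 28; call `.distinct()` on the result if you want set semantics.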
On Tue, Dec 29, 2015 at 10:50 AM, Gokula Krishnan D <email2...@gmail.com> wrote:

> Ted - Thanks for the updates. Then its the same case with sc.parallelize()
> or sc.makeRDD() right.
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>
> On Tue, Dec 29, 2015 at 1:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> From RDD.scala :
>>
>> def ++(other: RDD[T]): RDD[T] = withScope {
>> this.union(other)
>>
>> They should be the same.
>>
>> On Tue, Dec 29, 2015 at 10:41 AM, email2...@gmail.com <email2...@gmail.com> wrote:
>>
>>> Hello All -
>>>
>>> tried couple of operations by using ++ and union on RDD's but realized that
>>> the end results are same. Do you know any differences?.
>>>
>>> val odd_partA = List(1,3,5,7,9,11,1,3,5,7,9,11,1,3,5,7,9,11)
>>> odd_partA: List[Int] = List(1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11)
>>>
>>> val odd_partB = List(1,3,13,15,9)
>>> odd_partB: List[Int] = List(1, 3, 13, 15, 9)
>>>
>>> val odd_partC = List(15,9,1,3,13)
>>> odd_partC: List[Int] = List(15, 9, 1, 3, 13)
>>>
>>> val odd_partA_RDD = sc.parallelize(odd_partA)
>>> odd_partA_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[9] at parallelize at <console>:17
>>>
>>> val odd_partB_RDD = sc.parallelize(odd_partB)
>>> odd_partB_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[10] at parallelize at <console>:17
>>>
>>> val odd_partC_RDD = sc.parallelize(odd_partC)
>>> odd_partC_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[11] at parallelize at <console>:17
>>>
>>> val odd_PARTAB_pp = odd_partA_RDD ++(odd_partB_RDD)
>>> odd_PARTAB_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[12] at $plus$plus at <console>:23
>>>
>>> val odd_PARTAB_union = odd_partA_RDD.union(odd_partB_RDD)
>>> odd_PARTAB_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[13] at union at <console>:23
>>>
>>> odd_PARTAB_pp.count
>>> res8: Long = 23
>>>
>>> odd_PARTAB_union.count
>>> res9: Long = 23
>>>
>>> val odd_PARTABC_pp = odd_partA_RDD ++(odd_partB_RDD) ++ (odd_partC_RDD)
>>> odd_PARTABC_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[15] at $plus$plus at <console>:27
>>>
>>> val odd_PARTABC_union = odd_partA_RDD.union(odd_partB_RDD).union(odd_partC_RDD)
>>> odd_PARTABC_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[17] at union at <console>:27
>>>
>>> odd_PARTABC_pp.count
>>> res10: Long = 28
>>>
>>> odd_PARTABC_union.count
>>> res11: Long = 28
>>>
>>> Thanks
>>> Gokul
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/difference-between-and-Union-of-a-RDD-tp25830.html