bq. same case with sc.parallelize() or sc.makeRDD()

I think so.
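
A minimal sketch for checking both points locally, in case it helps. It assumes Spark is on the classpath and uses a local[2] master; the object name UnionCheck and the app name are made up for illustration. As far as I can tell, the makeRDD(seq, numSlices) overload just delegates to parallelize, and ++ just calls union, so the assertions below should hold:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper object, not part of Spark.
object UnionCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads; the app name is arbitrary.
    val conf = new SparkConf().setAppName("union-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = List(1, 3, 5, 7, 9, 11)
      val b = List(1, 3, 13, 15, 9)

      // parallelize and makeRDD should yield the same elements in the same order.
      val viaParallelize = sc.parallelize(a)
      val viaMakeRDD = sc.makeRDD(a)
      assert(viaParallelize.collect().sameElements(viaMakeRDD.collect()))

      // ++ and union should yield the same elements in the same order.
      val bRdd = sc.parallelize(b)
      val viaPlusPlus = viaParallelize ++ bRdd
      val viaUnion = viaParallelize.union(bRdd)
      assert(viaPlusPlus.collect().sameElements(viaUnion.collect()))

      // Neither removes duplicates: the count is the sum of both inputs.
      assert(viaUnion.count() == a.size + b.size)
      // Call distinct() explicitly if set semantics are wanted.
      assert(viaUnion.distinct().count() == (a ++ b).distinct.size)
    } finally {
      sc.stop()
    }
  }
}
```

Note that neither ++ nor union deduplicates, which is why the counts in the original post are simple sums (18 + 5 = 23, then 23 + 5 = 28).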

On Tue, Dec 29, 2015 at 10:50 AM, Gokula Krishnan D <email2...@gmail.com>
wrote:

> Ted - Thanks for the update. Then it's the same case with sc.parallelize()
> or sc.makeRDD(), right?
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>
> On Tue, Dec 29, 2015 at 1:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> From RDD.scala :
>>
>>   def ++(other: RDD[T]): RDD[T] = withScope {
>>     this.union(other)
>>   }
>>
>> They should be the same.
>>
>> On Tue, Dec 29, 2015 at 10:41 AM, email2...@gmail.com <
>> email2...@gmail.com> wrote:
>>
>>> Hello All -
>>>
>>> I tried a couple of operations using ++ and union on RDDs and found that
>>> the end results are the same. Are there any differences between them?
>>>
>>> val odd_partA  = List(1,3,5,7,9,11,1,3,5,7,9,11,1,3,5,7,9,11)
>>> odd_partA: List[Int] = List(1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11)
>>>
>>> val odd_partB  = List(1,3,13,15,9)
>>> odd_partB: List[Int] = List(1, 3, 13, 15, 9)
>>>
>>> val odd_partC  = List(15,9,1,3,13)
>>> odd_partC: List[Int] = List(15, 9, 1, 3, 13)
>>>
>>> val odd_partA_RDD = sc.parallelize(odd_partA)
>>> odd_partA_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[9] at parallelize at <console>:17
>>>
>>> val odd_partB_RDD = sc.parallelize(odd_partB)
>>> odd_partB_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[10] at parallelize at <console>:17
>>>
>>> val odd_partC_RDD = sc.parallelize(odd_partC)
>>> odd_partC_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[11] at parallelize at <console>:17
>>>
>>> val odd_PARTAB_pp = odd_partA_RDD ++(odd_partB_RDD)
>>> odd_PARTAB_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[12] at $plus$plus at <console>:23
>>>
>>> val odd_PARTAB_union = odd_partA_RDD.union(odd_partB_RDD)
>>> odd_PARTAB_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[13] at union at <console>:23
>>>
>>> odd_PARTAB_pp.count
>>> res8: Long = 23
>>>
>>> odd_PARTAB_union.count
>>> res9: Long = 23
>>>
>>> val odd_PARTABC_pp = odd_partA_RDD ++(odd_partB_RDD) ++ (odd_partC_RDD)
>>> odd_PARTABC_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[15] at $plus$plus at <console>:27
>>>
>>> val odd_PARTABC_union = odd_partA_RDD.union(odd_partB_RDD).union(odd_partC_RDD)
>>> odd_PARTABC_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[17] at union at <console>:27
>>>
>>> odd_PARTABC_pp.count
>>> res10: Long = 28
>>>
>>> odd_PARTABC_union.count
>>> res11: Long = 28
>>>
>>> Thanks
>>> Gokul
>>>
>>
>
