Re: Union in Spark context

2018-02-05 Thread Suchith J N
Thank you very much. I had overlooked the differences between the two. The public API part is understandable. Coming to the second part: I see that it creates an instance of UnionRDD with all RDDs as parents, thereby preventing a long lineage chain. Is my understanding correct? On 5 February 2018 at
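
The lineage point above can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: `ToyRDD`, `context_union`, and `lineage_depth` are hypothetical names made up for the illustration. It assumes that the two-argument union adds one node with two parents on every call, while the context-level union adds a single node whose direct parents are all of the inputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: a name plus its parent RDDs."""
    name: str
    parents: List["ToyRDD"] = field(default_factory=list)

    def union(self, other: "ToyRDD") -> "ToyRDD":
        # Models the two-argument union on an RDD: one new node, two parents.
        return ToyRDD(f"union({self.name},{other.name})", [self, other])


def context_union(rdds: List[ToyRDD]) -> ToyRDD:
    # Models the context-level union: one node with every input as a direct parent.
    return ToyRDD("union(" + ",".join(r.name for r in rdds) + ")", list(rdds))


def lineage_depth(rdd: ToyRDD) -> int:
    # Number of union nodes on the longest path down to a source RDD.
    if not rdd.parents:
        return 0
    return 1 + max(lineage_depth(p) for p in rdd.parents)


sources = [ToyRDD(f"r{i}") for i in range(5)]

# Chaining the two-argument union nests one node per call.
chained = sources[0]
for nxt in sources[1:]:
    chained = chained.union(nxt)

# The context-level union keeps all inputs under a single node.
flat = context_union(sources)

chained_depth = lineage_depth(chained)
flat_depth = lineage_depth(flat)
print(chained_depth, flat_depth)
```

In this model, chaining over n inputs yields a depth of n - 1, whereas the single n-ary union stays at depth 1 regardless of how many inputs it takes.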

Re: Union in Spark context

2018-02-05 Thread Mark Hamstra
First, the public API cannot be changed except when there is a major version change, and there is no way that we are going to do Spark 3.0.0 just for this change. Second, the change would be a mistake since the two different union methods are quite different. The method in RDD only ever works on t

Re: Union in Spark context

2018-02-05 Thread 0xF0F0F0
There is one on RDD, but `SparkContext.union` prevents lineage from growing. Check https://stackoverflow.com/q/34461804

Union in Spark context

2018-02-05 Thread Suchith J N
Hi, This seems like a simple cleanup: why do we have union() on RDDs in SparkContext? Shouldn't it reside in RDD? There is one in RDD, but it seems to be a wrapper around this one. Regards, Suchith