Thanks a lot!

Best Regards,
Weiwei

On Sat, Feb 20, 2016 at 11:53 PM, Hemant Bhanawat <hemant9...@gmail.com> wrote:

> toDF internally calls sqlcontext.createDataFrame, which transforms the RDD
> to RDD[InternalRow]. This RDD[InternalRow] is then mapped to a DataFrame.
>
> Type conversions (from Scala types to Catalyst types) are involved, but no
> shuffling.
>
> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
> www.snappydata.io
>
> On Sun, Feb 21, 2016 at 11:48 AM, Weiwei Zhang <wzhan...@dons.usfca.edu>
> wrote:
>
>> Hi there,
>>
>> Could someone explain to me what happens behind the scenes in rdd.toDF()?
>> More importantly, will this step involve a lot of shuffles and cause a
>> surge in the size of intermediate files? Thank you.
>>
>> Best Regards,
>> Vivian
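
For anyone who wants to check the "no shuffling" point on their own data, here is a minimal sketch. It assumes Spark 2.x or later, where SparkSession is the entry point (on 1.6, the version current when this thread was written, the equivalent import would be sqlContext.implicits._). The Person case class, the object name, and the sample values are hypothetical and only there for illustration. The idea is to compare partition counts before and after toDF() and to print the physical plan; a shuffle would normally appear there as an Exchange operator.

```scala
import org.apache.spark.sql.SparkSession

object ToDFSketch {
  // Hypothetical record type, used only for illustration.
  // It is declared outside main so Spark can derive an encoder for it.
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark 2.x+ session is sufficient for this check.
    val spark = SparkSession.builder()
      .appName("toDF-sketch")
      .master("local[*]")
      .getOrCreate()

    // Brings the implicit conversion that adds toDF() to RDDs into scope.
    import spark.implicits._

    val rdd = spark.sparkContext.parallelize(
      Seq(Person("a", 1), Person("b", 2), Person("c", 3)),
      numSlices = 4
    )

    // Row-by-row conversion from Scala objects to Catalyst's internal rows.
    val df = rdd.toDF()

    // If the conversion is purely per-row, the partition count should not change.
    println(s"RDD partitions: ${rdd.getNumPartitions}")
    println(s"DataFrame partitions: ${df.rdd.getNumPartitions}")

    // Prints the physical plan; look for the absence of an Exchange operator.
    df.explain()

    spark.stop()
  }
}
```

Later operations on the DataFrame (joins, groupBy, repartition) can still introduce shuffles; the sketch only looks at the toDF() step itself.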