Re: Behind the scene of RDD to DataFrame

Weiwei Zhang Sun, 21 Feb 2016 09:31:10 -0800

Thanks a lot!

Best Regards,
Weiwei


On Sat, Feb 20, 2016 at 11:53 PM, Hemant Bhanawat <hemant9...@gmail.com>
wrote:

> toDF internally calls sqlContext.createDataFrame, which transforms the RDD
> into an RDD[InternalRow]. This RDD[InternalRow] is then mapped to a DataFrame.
>
> Type conversions (from Scala types to Catalyst types) are involved, but no
> shuffling.
>
> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
> www.snappydata.io
>
> On Sun, Feb 21, 2016 at 11:48 AM, Weiwei Zhang <wzhan...@dons.usfca.edu>
> wrote:
>
>> Hi there,
>>
>> Could someone explain what happens behind the scenes of rdd.toDF()? More
>> importantly, will this step involve a lot of shuffling and cause a surge
>> in the size of intermediate files? Thank you.
>>
>> Best Regards,
>> Vivian
>>
>
>
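
For readers who want to check the behaviour described in the thread, here is a minimal sketch, assuming Spark 1.6 (the release current when this thread was written) running in local mode. The `Person` case class, the `ToDFSketch` object, and the sample rows are hypothetical and exist only for illustration. The idea is to compare the partition count before and after the conversion and to print the lineage, where a shuffle would show up as a shuffled RDD stage.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type, used only for this illustration.
// It is declared at top level so Spark can derive a schema from it.
case class Person(name: String, age: Int)

object ToDFSketch {
  def main(args: Array[String]): Unit = {
    // Local two-thread master; an assumption made so the sketch runs standalone.
    val conf = new SparkConf().setAppName("toDF-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings rdd.toDF() into scope

    // An RDD of Scala objects spread over 4 partitions.
    val rdd = sc.parallelize(
      Seq(Person("a", 1), Person("b", 2), Person("c", 3)), numSlices = 4)

    // The implicit conversion discussed in the thread.
    val df = rdd.toDF()

    // The explicit call that, per the thread, toDF uses internally.
    val viaCreate = sqlContext.createDataFrame(rdd)

    // Scala types (String, Int) appear here as Catalyst types.
    df.printSchema()
    viaCreate.printSchema()

    // A row-by-row conversion should leave the partitioning unchanged.
    println(s"RDD partitions:       ${rdd.partitions.length}")
    println(s"DataFrame partitions: ${df.rdd.partitions.length}")

    // Lineage of the underlying RDD; inspect it for shuffle stages.
    println(df.rdd.toDebugString)

    sc.stop()
  }
}
```

If the two partition counts match and the lineage lists only map-style stages on top of the parallelized collection, that is consistent with the answer above: the conversion costs a per-row type conversion but does not write shuffle files.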
