Just split the single RDD into multiple individual RDDs using a filter
operation, and then convert each individual RDD to its respective
DataFrame.
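
To make that concrete, here is a rough PySpark sketch of the idea. It is only
an illustration under a few assumptions that are not in your mail: every
record can report which schema it belongs to (the hypothetical tag_of
function), the row data can be pulled out of the record (the hypothetical
payload_of function), and the set of schemas is known up front as a dict of
tag -> StructType. The helper names split_by_schema, to_dataframes and demo
are made up for this example. The same filter + createDataFrame calls exist
in the Java and Scala APIs.

```python
def split_by_schema(rdd, tags, tag_of):
    """Return {tag: filtered RDD}, one lazy filter per known schema tag."""
    # Bind `tag` as a default argument so each lambda keeps its own value
    # instead of all lambdas seeing the last tag of the loop.
    return {
        tag: rdd.filter(lambda rec, tag=tag: tag_of(rec) == tag)
        for tag in tags
    }


def to_dataframes(sql_context, rdd, schemas, tag_of, payload_of):
    """Return {tag: DataFrame}, built from the matching slice of the RDD."""
    parts = split_by_schema(rdd, schemas, tag_of)
    return {
        tag: sql_context.createDataFrame(part.map(payload_of), schemas[tag])
        for tag, part in parts.items()
    }


def demo():
    # Needs a Spark installation; table and column names are illustrative.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext
    from pyspark.sql.types import (DoubleType, IntegerType, StringType,
                                   StructField, StructType)

    sc = SparkContext("local[2]", "split-by-schema")
    sql_context = SQLContext(sc)

    schemas = {
        "users": StructType([StructField("id", IntegerType()),
                             StructField("name", StringType())]),
        "scores": StructType([StructField("score", DoubleType())]),
    }
    # Each record is (schema tag, row tuple); your custom RDD would need to
    # expose something equivalent.
    rdd = sc.parallelize([("users", (1, "x")),
                          ("scores", (2.5,)),
                          ("users", (2, "y"))])
    rdd.cache()  # the source is scanned once per schema, so caching helps

    dfs = to_dataframes(sql_context, rdd, schemas,
                        tag_of=lambda rec: rec[0],
                        payload_of=lambda rec: rec[1])

    dfs["users"].registerTempTable("users")  # Spark 1.x temp-table API
    sql_context.sql("SELECT name FROM users WHERE id > 1").show()
    sc.stop()
```

Calling demo() on a machine with Spark installed runs the whole thing
locally. Nothing is collected to the driver, so no partition has to fit in
memory as a List. The trade-off is one pass over the source RDD per schema,
which is why the sketch caches it first.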

On Thu, Feb 27, 2020, 7:29 AM Manjunath Shetty H <manjunathshe...@live.com>
wrote:

>
> Hello All,
>
> In Spark I am creating custom partitions with a custom RDD, and each
> partition has a different schema. In the transformation step we need to
> get the schema and run some DataFrame SQL queries per partition, because
> the data in each partition has a different schema.
>
> How can I get a DataFrame per partition of an RDD?
>
> As of now I am calling foreachPartition on the RDD, converting the
> Iterable<Row> to a List, and converting that to a DataFrame. The problem
> is that converting the Iterable to a List brings all the data into memory,
> which might crash the process.
>
> Is there any known way to do this? Or is there any way to handle custom
> partitions in DataFrames instead of using an RDD?
>
> I am using Spark version 1.6.2.
>
> Any pointers would be helpful. Thanks in advance.
>
>
