Can you please send the full error message and stack trace? It would be very helpful for getting to the root cause.
On Sun, Jul 19, 2020 at 10:57 PM anbutech <anbutec...@outlook.com> wrote:
> Hi Team,
>
> I'm seeing odd behavior with a PySpark DataFrame (Databricks, Delta,
> Spark 3.0.0). I have tried the two options below to write the processed
> DataFrame into a Delta table partitioned on the columns listed. Option 1
> completely overwrites the whole table, and I can't figure out why the
> DataFrame fully overwrites it here.
>
> I'm also getting the following error while testing option 2:
>
> Predicate references non-partition column 'json_feeds_flatten_data'. Only
> the partition columns may be referenced: [table_name, y, m, d, h];
>
> Could you please tell me why PySpark behaves like this? It would be very
> helpful to understand the mistake here.
>
> Sample partition column values:
> -------------------------------
>
> table_name='json_feeds_flatten_data'
> y=2020
> m=7
> d=19
> h=0
>
> Option 1:
>
> partition_keys = ['table_name', 'y', 'm', 'd', 'h']
>
> (final_df
>  .withColumn('y', lit(y).cast('int'))
>  .withColumn('m', lit(m).cast('int'))
>  .withColumn('d', lit(d).cast('int'))
>  .withColumn('h', lit(h).cast('int'))
>  .write
>  .partitionBy(partition_keys)
>  .format("delta")
>  .mode('overwrite')
>  .saveAsTable(target_table)
> )
>
> Option 2:
>
> rep_wh = 'table_name={} AND y={} AND m={} AND d={} AND h={}'.format(
>     table_name, y, m, d, h)
>
> (final_df
>  .withColumn('y', lit(y).cast('int'))
>  .withColumn('m', lit(m).cast('int'))
>  .withColumn('d', lit(d).cast('int'))
>  .withColumn('h', lit(h).cast('int'))
>  .write
>  .format("delta")
>  .mode('overwrite')
>  .option('replaceWhere', rep_wh)
>  .saveAsTable(target_table)
> )
>
> Thanks
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
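A couple of thoughts based on the quoted error, as a guess rather than a confirmed diagnosis. For option 1: a plain `mode('overwrite')` without `replaceWhere` replaces the entire Delta table, which would explain the full overwrite. For option 2: `replaceWhere` takes a SQL predicate string, and since `table_name` holds a string value, the unquoted format produces `table_name=json_feeds_flatten_data`, which Spark parses as a comparison against a *column* named `json_feeds_flatten_data` — matching the error text. A minimal sketch of the predicate construction only (plain Python, no Spark needed; variable values taken from the thread):

```python
# Partition values from the original message.
table_name = 'json_feeds_flatten_data'
y, m, d, h = 2020, 7, 19, 0

# Unquoted string value: 'json_feeds_flatten_data' is parsed as a column
# reference, which would trigger "Predicate references non-partition column".
bad = 'table_name={} AND y={} AND m={} AND d={} AND h={}'.format(
    table_name, y, m, d, h)

# Quoting the string value makes it a SQL string literal, so the predicate
# references only the partition columns.
good = "table_name='{}' AND y={} AND m={} AND d={} AND h={}".format(
    table_name, y, m, d, h)

print(good)
# -> table_name='json_feeds_flatten_data' AND y=2020 AND m=7 AND d=19 AND h=0
```

If that is the cause, passing the quoted predicate to `.option('replaceWhere', good)` together with `mode('overwrite')` should replace only the matching partition instead of the whole table. Worth verifying against your exact Delta version, though.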