My periodically running process writes data to a table backed by Parquet files, with the configuration `"spark.sql.sources.partitionOverwriteMode" = "dynamic"`, using the following code:
```scala
if (!tableExists) {
  df.write
    .mode("overwrite")
    .partitionBy("partitionCol")
    .format("parquet")
    .saveAsTable("tablename")
} else {
  df.write
    .format("parquet")
    .mode("overwrite")
    .insertInto("tablename")
}
```

If the table doesn't exist, it is created in the first branch and everything works fine; on the next run, when the table does exist, the else branch runs as expected. However, when I create the table over existing Parquet files, either through a Hive session or using `spark.sql("CREATE TABLE...")`, and then run the process, it fails to write with the error:

> org.apache.spark.SparkException: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict

Adding this configuration to the Spark conf solves the issue, but I don't understand why it is needed when the table is created through a command yet not needed when the table is created with `saveAsTable`. I also don't understand how this configuration is relevant to Spark. [From what I've read](https://cwiki.apache.org/confluence/display/hive/tutorial#Tutorial-Dynamic-PartitionInsert), a static partition here means we directly specify the partition value to write into, instead of only specifying the column to partition by. Is it even possible to do such an insert in Spark (as opposed to HiveQL)?

Spark 2.4, Hadoop 3.1. Thanks!
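For illustration, here is what I imagine a static-partition insert and the workaround would look like from Spark (a sketch only — the partition value `'p1'`, the source columns, and `source_view` are placeholders, not from my actual job):

```scala
// Sketch of a static-partition insert through spark.sql, analogous to
// HiveQL's INSERT ... PARTITION (col='value'). Table and partition column
// names match the question; 'p1' and the SELECT list are illustrative.
spark.sql("""
  INSERT OVERWRITE TABLE tablename PARTITION (partitionCol = 'p1')
  SELECT col1, col2 FROM source_view
""")

// The workaround I mentioned: relax Hive's strict mode so a fully
// dynamic insert (no static partition column at all) is allowed.
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
```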