LantaoJin edited a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
URL: https://github.com/apache/spark/pull/25840#issuecomment-542079294
 
 
   The testing below is done with a non-Hive metastore.
   ```shell
   $ rm -fr metastore_db/
   $ bin/spark-sql --master local --conf spark.sql.catalogImplementation=in-memory --conf spark.sql.dynamic.partition.maxPartitions=3 --conf spark.sql.sources.partitionOverwriteMode=DYNAMIC
   spark-sql> create table data_source_dynamic_partition(i int, part1 int, part2 int) using parquet partitioned by (part1, part2);
   spark-sql> insert overwrite table data_source_dynamic_partition partition(part1, part2) select 1, 2, id from range(5);
   
   19/10/15 15:26:32 ERROR Utils: Aborting task
   org.apache.spark.SparkException: Total number of dynamic partitions created is 4, which is more than 3. To solve this try to increase spark.sql.dynamic.partition.maxPartitions
        at org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol.newTaskTempFile(SQLHadoopMapReduceCommitProtocol.scala:86)
        at org.apache.spark.sql.execution.datasources.DynamicPartitionDataWriter.newOutputWriter(FileFormatDataWriter.scala:234)
        at org.apache.spark.sql.execution.datasources.DynamicPartitionDataWriter.write(FileFormatDataWriter.scala:261)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:273)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:270)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:127)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:455)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:458)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   19/10/15 15:26:32 WARN FileOutputCommitter: Could not delete file:/tmp/tmp-warehouse/data_source_dynamic_partition/_temporary/0/_temporary/attempt_20191015152631_0000_m_000000_0
   19/10/15 15:26:32 ERROR FileFormatWriter: Job job_20191015152631_0000 aborted.
   ```
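
   For context, here is a minimal self-contained sketch of the kind of check that produces the error above: counting the distinct dynamic-partition directories a task writes and failing once a configured cap is exceeded. The class and parameter names (`PartitionLimitTracker`, `maxDynamicPartitions`) are illustrative assumptions, not the actual code added by this PR, and a plain `RuntimeException` stands in for `SparkException` to keep the example dependency-free.
   ```scala
   import scala.collection.mutable

   // Hypothetical sketch: track distinct dynamic-partition directories seen by a
   // task and fail fast once the configured cap is exceeded.
   class PartitionLimitTracker(maxDynamicPartitions: Int) {
     private val seenPartitions = mutable.HashSet.empty[String]

     /** Register the partition directory for a row about to be written. */
     def register(partitionDir: String): Unit = {
       seenPartitions += partitionDir
       if (seenPartitions.size > maxDynamicPartitions) {
         // Mirrors the message shown in the log above.
         throw new RuntimeException(
           s"Total number of dynamic partitions created is ${seenPartitions.size}, " +
           s"which is more than $maxDynamicPartitions. To solve this try to increase " +
           "spark.sql.dynamic.partition.maxPartitions")
       }
     }
   }

   object PartitionLimitTrackerExample {
     def main(args: Array[String]): Unit = {
       val tracker = new PartitionLimitTracker(maxDynamicPartitions = 3)
       // part1=2, part2=0..4 yields 5 partition directories;
       // registering the 4th one trips the limit, as in the test above.
       (0 until 5).foreach { id =>
         tracker.register(s"part1=2/part2=$id")
       }
     }
   }
   ```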
