fcvr1010 edited a comment on issue #2991: URL: https://github.com/apache/iceberg/issues/2991#issuecomment-909209277
I think I'm also running into the same problem. The setup is similar: EMR 6.2.0 (so Spark 3.0.1), Iceberg 0.12, and the Glue Catalog with DynamoDB for locking as per the [Iceberg docs](https://iceberg.apache.org/aws/#iceberg-aws-integrations). The job is a PySpark job. The full error is below.

From `stderr`:

```text
ERROR TaskSetManager Task 0 in stage 3.0 failed 4 times; aborting job
ERROR AppendDataExec Data source write support IcebergBatchWrite(table=my_table, format=PARQUET) is aborting.
ERROR AppendDataExec Data source write support IcebergBatchWrite(table=my_table, format=PARQUET) aborted.
ERROR ApplicationMaster User application exited with status 1
```

and from `stdout`:

```text
py4j.protocol.Py4JJavaError: An error occurred while calling o147.saveAsTable.
: org.apache.spark.SparkException: Writing job aborted.
[...]
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 6, ip-1-2-3-4.eu-central-1.compute.internal, executor 3): java.io.UncheckedIOException: Failed to create output stream for location: s3://my_table_prefix/data/00000-6-ae5ed834-17b9-423f-b650-7c2783ebf81b-00001.parquet
    at org.apache.iceberg.aws.s3.S3OutputFile.createOrOverwrite(S3OutputFile.java:64)
    at org.apache.iceberg.parquet.ParquetIO$ParquetOutputFile.createOrOverwrite(ParquetIO.java:153)
[...]
Caused by: java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(File.java:2063)
    at org.apache.iceberg.aws.s3.S3OutputStream.newStream(S3OutputStream.java:178)
    at org.apache.iceberg.aws.s3.S3OutputStream.<init>(S3OutputStream.java:114)
    at org.apache.iceberg.aws.s3.S3OutputFile.createOrOverwrite(S3OutputFile.java:62)
    ... 28 more
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2215)
[...]
Caused by: java.io.UncheckedIOException: Failed to create output stream for location: s3://my_table_prefix/data/00000-6-ae5ed834-17b9-423f-b650-7c2783ebf81b-00001.parquet
[...]
Caused by: java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(File.java:2063)
    at org.apache.iceberg.aws.s3.S3OutputStream.newStream(S3OutputStream.java:178)
    at org.apache.iceberg.aws.s3.S3OutputStream.<init>(S3OutputStream.java:114)
    at org.apache.iceberg.aws.s3.S3OutputFile.createOrOverwrite(S3OutputFile.java:62)
    ... 28 more
```

The Iceberg and related JARs are baked into the Amazon Machine Image we use, with configuration similar to the Spark example in [the docs](https://iceberg.apache.org/aws/#iceberg-aws-integrations). We also have multiple [core nodes](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-master-core-task-nodes.html), but I don't think the issue comes from the core nodes themselves. In fact, I observed that it happens in the following scenario: the driver of an application A and an executor of an application B run concurrently on the same node, and the executor fails with the message above. I verified that the node is one of the _task_ nodes, not one of the _core_ ones.

I tried adding `-Djava.io.tmpdir=/tmp/driver/` to `spark.driver.extraJavaOptions`, but it didn't help in my case.
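For completeness, here is a minimal sketch of the PySpark session setup, following the catalog example from the docs page linked above. The catalog name `glue_catalog`, the warehouse path, and the lock table name are placeholders, and the class names are as I remember them from the 0.12 docs. The last two `config` lines are untested ideas rather than part of the docs example: the failing `File.createTempFile` call happens on the executors, so a driver-only `java.io.tmpdir` override cannot reach it, and if I read `S3OutputStream` correctly it stages data in a local directory that defaults to `java.io.tmpdir` and can be overridden with the `s3.staging-dir` catalog property.

```python
# Rough sketch only; bucket, database, lock-table names and the /tmp paths are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-glue-write")
    # Glue catalog + S3FileIO + DynamoDB locking, as in the Iceberg AWS docs example
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.warehouse", "s3://my-bucket/my/prefix")
    .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue_catalog.lock-impl", "org.apache.iceberg.aws.glue.DynamoLockManager")
    .config("spark.sql.catalog.glue_catalog.lock.table", "my-lock-table")
    # Untested ideas, not from the docs: point the executors' java.io.tmpdir and/or
    # S3FileIO's local staging dir at a directory that exists on every node,
    # since the createTempFile failure happens executor-side.
    .config("spark.executor.extraJavaOptions", "-Djava.io.tmpdir=/tmp")
    .config("spark.sql.catalog.glue_catalog.s3.staging-dir", "/tmp")
    .getOrCreate()
)
```

Depending on the deploy mode, `spark.executor.extraJavaOptions` may need to be passed at submit time (`spark-submit --conf ...`) rather than in the session builder; I haven't verified whether either knob actually avoids the error.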