[ 
https://issues.apache.org/jira/browse/SPARK-32742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187751#comment-17187751
 ] 

Ryan Luo commented on SPARK-32742:
----------------------------------

[~hyukjin.kwon] Thanks for advising.

> FileOutputCommitter warns "No Output found for attempt"
> -------------------------------------------------------
>
>                 Key: SPARK-32742
>                 URL: https://issues.apache.org/jira/browse/SPARK-32742
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>         Environment: Hadoop 2.6.0-cdh5.16.2
> YARN(MR2 included)
>  
>            Reporter: Ryan Luo
>            Priority: Major
>
> Hi team,
> This is my first time reporting an issue here.
> We submitted and ran a Spark job on the cluster.
> We found that one of the Parquet output partitions is missing from the
> output directory. We checked the Spark job log, and every task shows a
> status of success. The output record count matches the expected number.
> However, when we checked the container log, we found a warning saying
> *No Output found for attempt_20200819094307_0003_m_000002_11*, after which
> the output was not moved from taskAttemptPath to the output directory. As a
> result, we are missing some of the output rows.
> Re-running the job resolved the issue; however, the report is critical for
> us. We would appreciate it if you could advise on the cause of the issue.
>  
> Below are the container logs:
>  
> {code:java}
> 20/08/19 09:44:57 INFO output.FileOutputCommitter: FileOutputCommitter skip 
> cleanup _temporary folders under output directory:false, ignore cleanup 
> failures: false
> 20/08/19 09:44:57 INFO datasources.SQLHadoopMapReduceCommitProtocol: Using 
> user defined output committer class parquet.hadoop.ParquetOutputCommitter
> 20/08/19 09:44:57 INFO output.FileOutputCommitter: File Output Committer 
> Algorithm version is 2
> 20/08/19 09:44:57 INFO output.FileOutputCommitter: FileOutputCommitter skip 
> cleanup _temporary folders under output directory:false, ignore cleanup 
> failures: false
> 20/08/19 09:44:57 INFO datasources.SQLHadoopMapReduceCommitProtocol: Using 
> output committer class parquet.hadoop.ParquetOutputCommitter
> 20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 12.370642 ms
> 20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 6.927118 ms
> 20/08/19 09:44:57 INFO codegen.CodeGenerator: Code generated in 12.004204 ms
> 20/08/19 09:44:57 INFO parquet.ParquetWriteSupport: Initialized Parquet 
> WriteSupport with Catalyst schema:
> ..... (skipped)
> 20/08/19 09:44:57 WARN output.FileOutputCommitter: No Output found for 
> attempt_20200819094307_0003_m_000002_11
> 20/08/19 09:44:57 INFO mapred.SparkHadoopMapRedUtil: 
> attempt_20200819094307_0003_m_000002_11: Committed
> {code}
>  
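
For anyone checking whether other containers hit the same warning, the log
excerpt above can be scanned with a small script. This is only a sketch: the
function name find_no_output_attempts and the sample text are illustrative
assumptions, and the pattern is taken solely from the WARN line quoted in this
issue, so other Hadoop versions may word the message differently.

```python
import re

# Mail clients prefix quoted lines with "> " and wrap long log lines,
# so the attempt ID can land on the line after the warning text.
QUOTE_PREFIX = re.compile(r"^>\s?")

# Pattern based only on the WARN line quoted in this issue (an assumption
# about the message format, not a documented Hadoop contract).
NO_OUTPUT = re.compile(
    r"WARN\s+output\.FileOutputCommitter:\s+No Output found for\s+(attempt_\w+)"
)


def find_no_output_attempts(log_text):
    """Return the task attempt IDs that logged 'No Output found'."""
    # Drop quote prefixes, then re-join wrapped lines with single spaces.
    lines = [QUOTE_PREFIX.sub("", line).strip() for line in log_text.splitlines()]
    return NO_OUTPUT.findall(" ".join(lines))


# Sample copied from the container log in this issue.
sample = """\
> 20/08/19 09:44:57 WARN output.FileOutputCommitter: No Output found for
> attempt_20200819094307_0003_m_000002_11
> 20/08/19 09:44:57 INFO mapred.SparkHadoopMapRedUtil:
> attempt_20200819094307_0003_m_000002_11: Committed
"""

print(find_no_output_attempts(sample))
```

Any attempt ID it returns marks a task whose commit found nothing to move, so
the matching partition is worth checking in the output directory even when the
job itself reports success.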



