[ 
https://issues.apache.org/jira/browse/SPARK-48292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896990#comment-17896990
 ] 

Grisha Weintraub commented on SPARK-48292:
------------------------------------------

I asked in the PR (https://github.com/apache/spark/pull/46696), but I am trying 
here as well.

We are currently using EMR-6.13.0 with Spark 3.4.1 and are experiencing issues 
related to the "Authorized committer error". I know there are fixes available 
in Spark versions 3.4.4 and 3.5.2, but neither of these versions is currently 
available on the EMR platform.

As a workaround, we are considering disabling the OutputCommitCoordinator by 
setting "spark.hadoop.outputCommitCoordination.enabled" to "false".
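
For concreteness, a minimal sketch of how we would apply this setting, assuming it is picked up when the SparkSession (and its underlying SparkContext) is first created. The object and application names are hypothetical, and the write logic is omitted:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: object and app names are placeholders, not from our real job.
object DisableCommitCoordination {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("commit-coordination-workaround") // hypothetical name
      // The property named above; with "false", tasks no longer ask the
      // driver-side OutputCommitCoordinator for permission to commit.
      .config("spark.hadoop.outputCommitCoordination.enabled", "false")
      .getOrCreate()

    // ... existing write logic would go here ...

    spark.stop()
  }
}
```

The same property could instead be passed at submit time with `--conf spark.hadoop.outputCommitCoordination.enabled=false`, which avoids a code change.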

My question: if we are willing to accept occasional duplicates, is it safe 
to disable the OutputCommitCoordinator? Our main concern is data loss. Could 
that occur if we disable this feature?

> Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage 
> when committed file not consistent with task status
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-48292
>                 URL: https://issues.apache.org/jira/browse/SPARK-48292
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: L. C. Hsieh
>            Assignee: angerszhu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.5.2, 3.4.4
>
>
> When a task attempt fails but it is authorized to do task commit, 
> OutputCommitCoordinator will fail the stage with a reason message 
> which says that the task commit succeeded, but actually the driver never 
> knows whether a task commit is successful or not. We should update the 
> reason message to make it less confusing.
> See https://github.com/apache/spark/pull/36564#discussion_r1598660630



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
