[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

Venkatasubrahmanian Narayanan (Jira) Thu, 29 Feb 2024 10:50:06 -0800


    [ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822274#comment-17822274 ]


Venkatasubrahmanian Narayanan commented on HADOOP-19091:
--------------------------------------------------------

[[email protected]] Will keep all that in mind when I work on the Hadoop 
patches, thanks.

 

Hive uses MRv1, and migrating it to MRv2 so that it uses PathOutputCommitter 
etc. internally would be a lot of effort. However, I will see if something 
similar to the factory design can be done with the committer class, since 
that is configured explicitly for this design.

 

[~srahman] From the AM logs:

Job UUID job_1708973874189_0073 source JobID

Starting: Task committer attempt_1708973874189_0073_r_000000_1: commitJob(job_1708973874189_0073)
2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |impl.IOStatisticsStoreImpl|: Incrementing counter op_list_files by 1 with final value 1
2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: listFiles(s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073, false)
2024-02-29 18:05:05,767 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: Requesting all entries under emblembasic/__magic/__magic/job-job_1708973874189_0073/ with delimiter '/'

From the task logs:

Job UUID job_17089738741890_0073 source JobID

Saving work of attempt_17089738741890_0073_r_000000_0 to 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/task_17089738741890_0073_r_000000.pendingset

 

It's a very subtle difference (there's an extra 0 in the ID/path used by the 
task). The post-commitJob cleanup does delete the files, since it deletes 
everything under the __magic directory instead of looking only under the job 
dir. commitJob itself just fails to find the pending set when it lists the 
files, so it doesn't commit the results.
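
To make the mismatch concrete, here is a minimal Python sketch that models the 
bucket as an in-memory set of paths. It is not the committer's code: 
list_files and delete_tree are hypothetical stand-ins for the S3A listing and 
the post-commit cleanup, and only the IDs and paths are taken from the logs 
above.

```python
# IDs and paths copied from the AM and task logs quoted above.
AM_JOB_ID = "job_1708973874189_0073"
TASK_JOB_ID = "job_17089738741890_0073"
TASK_ID = "task_17089738741890_0073_r_000000"
MAGIC_ROOT = "s3a://hive-east-1-bucket/emblembasic/__magic/__magic"


def job_dir(job_id):
    """Layout seen in the logs: <magic root>/job-<job id>."""
    return f"{MAGIC_ROOT}/job-{job_id}"


def pendingset_path(job_id, task_id):
    """Where a task saves its pending set, per the task log line."""
    return f"{job_dir(job_id)}/{task_id}.pendingset"


def list_files(store, directory):
    """Hypothetical stand-in for a non-recursive listing of `directory`."""
    prefix = directory.rstrip("/") + "/"
    return sorted(
        p for p in store
        if p.startswith(prefix) and "/" not in p[len(prefix):]
    )


def delete_tree(store, directory):
    """Hypothetical stand-in for a recursive delete; returns files removed."""
    prefix = directory.rstrip("/") + "/"
    doomed = {p for p in store if p.startswith(prefix)}
    store.difference_update(doomed)
    return len(doomed)


def differing_fields(id_a, id_b):
    """Return (index, a, b) for each '_'-separated field that differs."""
    pairs = zip(id_a.split("_"), id_b.split("_"))
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


if __name__ == "__main__":
    # The task writes under the job directory derived from *its* job ID.
    store = {pendingset_path(TASK_JOB_ID, TASK_ID)}

    print("differing fields:", differing_fields(AM_JOB_ID, TASK_JOB_ID))
    # commitJob lists the directory derived from the AM's job ID.
    print("seen by commitJob:", list_files(store, job_dir(AM_JOB_ID)))
    # Cleanup removes everything under the magic root, whatever the job ID.
    print("removed by cleanup:", delete_tree(store, MAGIC_ROOT))
```

In this model the listing under the AM's job directory comes back empty while 
the cleanup still removes the task's file, which matches the behaviour 
described above: nothing is committed and no pending set is left behind.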

> Add support for Tez to MagicS3GuardCommitter
> --------------------------------------------
>
>                 Key: HADOOP-19091
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19091
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.6
>         Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>            Reporter: Venkatasubrahmanian Narayanan
>            Assignee: Venkatasubrahmanian Narayanan
>            Priority: Major
>         Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
