[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903734#comment-17903734
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:
--------------------------------------------------------

For the ITestConnectionTimeouts error, I see it consistently even for regular 
connections within the US (connecting from CA to us-east-1).

 

The changes I made to core-site.xml and core-default.xml are:

core-site.xml, Add:

 
<property>
  <name>fs.s3a.bucket.test-aws-s3n.endpoint.region</name>
  <value>us-east-1</value>
</property>

<property>
  <name>fs.s3a.bucket.test-aws-s3a.endpoint.region</name>
  <value>us-east-1</value>
</property>
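
For anyone unfamiliar with the per-bucket keys above: S3A lets an option set as fs.s3a.bucket.BUCKET.OPTION take precedence over fs.s3a.OPTION for that one bucket. The snippet below is only a rough sketch of that lookup order over a plain Map, not Hadoop's actual implementation; the class name PerBucketConfigSketch, the resolve() helper, and the eu-west-1 global default are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: mimics how a per-bucket S3A option shadows the global one. */
final class PerBucketConfigSketch {
    private static final String PREFIX = "fs.s3a.";
    private static final String BUCKET_PREFIX = "fs.s3a.bucket.";

    /** Returns the per-bucket value if set, else the global value, else null. */
    static String resolve(Map<String, String> conf, String bucket, String option) {
        String perBucket = conf.get(BUCKET_PREFIX + bucket + "." + option);
        return perBucket != null ? perBucket : conf.get(PREFIX + option);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical global default, not taken from my actual configuration.
        conf.put("fs.s3a.endpoint.region", "eu-west-1");
        // Per-bucket override, as in the core-site.xml change above.
        conf.put("fs.s3a.bucket.test-aws-s3a.endpoint.region", "us-east-1");

        // The test bucket picks up its own region.
        System.out.println(resolve(conf, "test-aws-s3a", "endpoint.region"));
        // Any other bucket falls back to the global setting.
        System.out.println(resolve(conf, "some-other-bucket", "endpoint.region"));
    }
}
```

The point is just that the two properties pin the region for the two test buckets without touching whatever region the rest of the configuration uses.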
 
core-default.xml, Replace:
 
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.ProfileAWSCredentialsProvider</value>
</property>

<property>
  <name>fs.s3a.assumed.role.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.ProfileAWSCredentialsProvider</value>
</property>
 
ProfileAWSCredentialsProvider is a simple wrapper I wrote around the AWS SDK's 
ProfileCredentialsProvider; it lets me run the tests with my account.

> Add support for Tez to MagicS3GuardCommitter
> --------------------------------------------
>
>                 Key: HADOOP-19091
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19091
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.6
>         Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>            Reporter: Venkatasubrahmanian Narayanan
>            Assignee: Venkatasubrahmanian Narayanan
>            Priority: Major
>         Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work in both Tez and Hadoop. I wanted to 
> start a discussion here so we can figure out how exactly to go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
