[ 
https://issues.apache.org/jira/browse/HADOOP-16428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Kaw updated HADOOP-16428:
-------------------------------
    Description: 
Currently, DistCp does not appear to use the S3A committers, either Magic or
Staging. I have noticed that most jobs built on the MapReduce framework use
the S3A committers, but DistCp does not: it uses the FileOutputCommitter even
when S3A committer parameters are specified in core-site.xml. Is this by
design? If so, can someone please explain the reason? Are there any
limitations or potential risks in using the S3A committers with DistCp?

I know there is a "-direct" option that can be used with the
FileOutputCommitter to avoid renaming while committing to object stores. But
if anyone can shed some light on the current limitations of the S3A
committers with DistCp, and on the reason for choosing the
FileOutputCommitter for DistCp over the S3A committers, it would be helpful.
Thanks.
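
For context, the "S3a committer parameters" mentioned above are usually set
along the lines of the fragment below. This is only an illustrative sketch:
the property names are the standard S3A committer options, but the values
shown are assumptions and are not taken from the reporter's cluster.

```xml
<!-- Illustrative core-site.xml fragment; values are assumed, not the reporter's. -->
<configuration>
  <!-- Route output committer creation for s3a:// paths to the S3A factory. -->
  <property>
    <name>mapreduce.outputcommitter.factory.scheme.s3a</name>
    <value>org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory</value>
  </property>
  <!-- Choose the committer: "directory" or "partitioned" (staging), or "magic". -->
  <property>
    <name>fs.s3a.committer.name</name>
    <value>magic</value>
  </property>
  <!-- Needed only when the magic committer is selected. -->
  <property>
    <name>fs.s3a.committer.magic.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

As reported here, DistCp continues to use the FileOutputCommitter even with
settings of this kind in place.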

  was:
Currently, DistCp does not appear to use the S3A committers, either Magic or
Staging. It uses the FileOutputCommitter even when S3A committer parameters
are specified in core-site.xml. Is this by design? If so, can someone please
explain the reason? Are there any limitations or potential risks in using the
S3A committers with DistCp?

I know there is a "-direct" option that can be used with the
FileOutputCommitter to avoid renaming while committing. But if anyone can
shed some light on the current limitations of the S3A committers with DistCp,
it would be helpful. Thanks.


> Distcp make use of S3a Committers, be it magic or staging
> ---------------------------------------------------------
>
>                 Key: HADOOP-16428
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16428
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.1.1
>            Reporter: Sahil Kaw
>            Priority: Minor
>             Fix For: 3.1.2
>
>
> Currently, DistCp does not appear to use the S3A committers, either Magic
> or Staging. I have noticed that most jobs built on the MapReduce framework
> use the S3A committers, but DistCp does not: it uses the
> FileOutputCommitter even when S3A committer parameters are specified in
> core-site.xml. Is this by design? If so, can someone please explain the
> reason? Are there any limitations or potential risks in using the S3A
> committers with DistCp?
> I know there is a "-direct" option that can be used with the
> FileOutputCommitter to avoid renaming while committing to object stores.
> But if anyone can shed some light on the current limitations of the S3A
> committers with DistCp, and on the reason for choosing the
> FileOutputCommitter for DistCp over the S3A committers, it would be
> helpful. Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)