[
https://issues.apache.org/jira/browse/HADOOP-12109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950891#comment-14950891
]
Chen He commented on HADOOP-12109:
----------------------------------
This is because the current swift code does a copy followed by a rename when
uploading a file (>5GB) to the object store. We should avoid the rename step;
otherwise it will always fail with HTTP 413.
> Distcp of file > 5GB to swift fails with HTTP 413 error
> -------------------------------------------------------
>
> Key: HADOOP-12109
> URL: https://issues.apache.org/jira/browse/HADOOP-12109
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/swift
> Affects Versions: 2.6.0, 2.7.1
> Reporter: Phil D'Amore
>
> Trying to use distcp to copy a file more than 5GB to swift fs results in a
> stack like the following:
> 15/06/01 20:58:57 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://xxx:8020/path/to/random-5Gplus.dat to swift://xxx/5Gplus.dat
> Invalid Response: Method COPY on http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_000000_0 failed, status code: 413, status line: HTTP/1.1 413 Request Entity Too Large
> COPY http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_000000_0 => 413 : <html><h1>Request Entity Too Large</h1><p>The body of your request was too large for this server.</p></html>
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1502)
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.copyObject(SwiftRestClient.java:923)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.copyObject(SwiftNativeFileSystemStore.java:765)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.rename(SwiftNativeFileSystemStore.java:617)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.rename(SwiftNativeFileSystem.java:577)
> at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
> at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:137)
> at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
> at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
> at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:280)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:252)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> It looks like the problem actually occurs in the rename operation which
> happens after the copy. The rename is implemented as a copy/delete, and this
> secondary copy looks like it's not done in a way that breaks up the file into
> smaller chunks.
> This looks like the following bug:
> https://bugs.launchpad.net/sahara/+bug/1428941
> That fix does not appear to be incorporated into Hadoop's swift client.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)