[ 
https://issues.apache.org/jira/browse/HADOOP-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309263#comment-15309263
 ] 

Chen He commented on HADOOP-13211:
----------------------------------

Thank you for the reply, [[email protected]]. 

IMHO, the hadoop-openstack driver is a bridge between HDFS and the OpenStack 
object store. MR and other native Hadoop frameworks should be able to rely on 
the Hadoop IPC retry. With the increasing popularity of HDFS, other computing 
frameworks such as Spark and in-memory storage systems such as Tachyon are 
also using the hadoop-openstack driver. I am not sure whether the Hadoop IPC 
retry is triggered when Spark or other frameworks use the hadoop-openstack 
driver. 

Those frameworks have task-level retry; however, retrying a whole task could 
be more costly than simply retrying at the driver level. 
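
To make the driver-level idea concrete, here is a minimal sketch of retrying 
only on 5xx responses with a bounded number of attempts. It is not taken from 
the Swift driver: the class name RetrySketch, the method names, and the 
maxRetries and backoffMillis parameters are all hypothetical, and the request 
is modelled as a plain function that returns an HTTP status code.

```java
import java.util.function.IntSupplier;

public class RetrySketch {
    /** True for HTTP 5xx (server error) status codes. */
    static boolean isServerError(int status) {
        return status >= 500 && status <= 599;
    }

    /**
     * Invokes the request once, then retries up to maxRetries more times,
     * but only while the response is a 5xx. Returns the last status seen.
     * Both maxRetries and backoffMillis are hypothetical settings.
     */
    static int callWithRetry(IntSupplier request, int maxRetries, long backoffMillis)
            throws InterruptedException {
        int status = request.getAsInt();
        for (int attempt = 0; attempt < maxRetries && isServerError(status); attempt++) {
            // Linear backoff: wait a little longer before each retry.
            Thread.sleep(backoffMillis * (attempt + 1));
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Hypothetical server: answers 503 twice, then 200.
        IntSupplier flaky = () -> ++calls[0] <= 2 ? 503 : 200;
        int status = callWithRetry(flaky, 3, 10L);
        System.out.println(status + " after " + calls[0] + " calls");
    }
}
```

In this sketch a server that keeps returning 5xx still results in a failure 
once the retries are used up; the caller just sees it later. Whether it is 
safe to replay a given request is a separate question that the sketch does 
not address.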

As for the data loss, it is a really good catch. If the server keeps failing 
and returning 5xx, the upload will eventually fail. The object store is not a 
file system and may not guarantee file-system-level integrity. I can't think 
of a scenario in which a retry causes data loss. Could you suggest one? 

> Swift driver should have a configurable retry feature when ecounter 5xx error
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-13211
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13211
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/swift
>    Affects Versions: 2.7.2
>            Reporter: Chen He
>            Assignee: Chen He
>
> In the current code, if the Swift driver encounters an HTTP 5xx, it throws 
> an exception and stops. As a driver, it would be more sophisticated if it 
> could retry a configurable number of times before reporting failure. There 
> are two reasons that I can imagine:
> 1. If the server is really busy, it may drop some requests to avoid a DDoS 
> attack.
> 2. If the server is accidentally unavailable for a short period of time and 
> then comes back, we may not need to fail the whole driver. Just recording 
> the exception and retrying may be more flexible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
