Actually, PhEDEx uses GridFTP for its data transfers.
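The multi-channel idea behind GridFTP-style transfers can be illustrated with a toy sketch: split one payload into byte ranges and move each range on its own "channel" (here, just a thread), then reassemble in order. This is only an illustration of range-splitting; the class and method names are made up, and none of it is actual GridFTP or PhEDEx code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelChannelsSketch {

    // Toy model of parallel streams: each "channel" copies its own
    // byte range independently, so one slow range doesn't stall the rest.
    public static byte[] transfer(byte[] src, int channels) throws Exception {
        byte[] dst = new byte[src.length];
        ExecutorService pool = Executors.newFixedThreadPool(channels);
        try {
            Future<?>[] futures = new Future<?>[channels];
            // Divide the payload into roughly equal ranges, one per channel.
            int chunk = (src.length + channels - 1) / channels;
            for (int i = 0; i < channels; i++) {
                final int start = Math.min(i * chunk, src.length);
                final int end = Math.min(start + chunk, src.length);
                futures[i] = pool.submit(() -> {
                    // One "channel": move its range in isolation.
                    System.arraycopy(src, start, dst, start, end - start);
                });
            }
            for (Future<?> f : futures) {
                f.get(); // wait until every channel has finished
            }
        } finally {
            pool.shutdown();
        }
        return dst;
    }
}
```

The point of running ranges in parallel over separate TCP connections is that a lost packet only stalls the congestion window of one stream, not the whole transfer.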

On Thu, Jan 13, 2011 at 5:34 AM, Steve Loughran <ste...@apache.org> wrote:

> On 13/01/11 08:34, li ping wrote:
>
>> That is also my concern: is it efficient for data transmission?
>>
>
> It's long-lived TCP connections, reasonably efficient for bulk data
> transfer, has all of TCP's built-in throttling, and comes with some
> excellently debugged client and server code in the form of Jetty and
> HttpClient. In maintenance costs alone, those libraries justify HTTP unless
> you have a vastly superior option *and are willing to maintain it forever*.
>
> FTP's limits are well known (security), NFS's limits are well known
> (security; the UDP version doesn't throttle), and self-developed protocols
> will have whatever problems you build into them.
>
> There are better protocols for long-haul data transfer over fat pipes, such
> as GridFTP and PhEDEx ( http://www.gridpp.ac.uk/papers/ah05_phedex.pdf ),
> which use multiple TCP channels in parallel to reduce the impact of a single
> lost packet, but within a datacentre you shouldn't have to worry about this.
> If you do find that lots of packets are getting lost, raise the issue with
> the networking team.
>
> -Steve
>
>
>
>> On Thu, Jan 13, 2011 at 4:27 PM, Nan Zhu <zhunans...@gmail.com> wrote:
>>
>>> Hi, all
>>>
>>> I have a question about the file transmission between the Map and Reduce
>>> stages. In the current implementation, the Reducers get the results
>>> generated by the Mappers through HTTP GET. I don't understand why HTTP was
>>> selected. Why not FTP, or a self-developed protocol?
>>>
>>> Is it just because HTTP is simple?
>>>
>>> thanks
>>>
>>> Nan
>>>
>>>
>>
>>
>>
>
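For reference, the kind of HTTP GET fetch the shuffle performs can be sketched with the JDK alone: stand up a tiny server holding a fake "map output" partition and pull it with a plain GET. The `/mapOutput` path and the payload are invented for illustration; real Hadoop uses its own servlet URLs and request parameters.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class ShuffleFetchSketch {

    // Serve a fake map-output partition and fetch it over HTTP GET,
    // roughly how a reducer pulls one mapper's partition.
    public static byte[] fetch() throws IOException {
        final byte[] mapOutput = "partition-0-bytes".getBytes("UTF-8");

        // Port 0 lets the OS pick a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/mapOutput", exchange -> {
            exchange.sendResponseHeaders(200, mapOutput.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(mapOutput);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = new URL("http://127.0.0.1:" + port + "/mapOutput");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (InputStream in = conn.getInputStream();
                 ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) > 0) {
                    buf.write(chunk, 0, n);
                }
                return buf.toByteArray();
            }
        } finally {
            server.stop(0);
        }
    }
}
```

Everything here is standard-library code, which is Steve's point: the hard parts (connection handling, throttling, debugged client and server stacks) come for free with HTTP.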
