[ 
https://issues.apache.org/jira/browse/HADOOP-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638605#action_12638605
 ] 

Raghu Angadi commented on HADOOP-4386:
--------------------------------------

> TransferTo and transferFrom are not async operations, but blocking operations.

It is mixed. Disk I/O is blocking, but socket I/O obeys the blocking setting of 
the socket. So if you are transferring from a file to a non-blocking socket, the 
read from the file is blocking (though, unlike readFully(), it may return after 
a partial read), and the write to the socket is non-blocking. 
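
A minimal sketch of that mixed behaviour, assuming a plain java.nio setup. The 
class name TransferToSketch and the loopback demo() helper are hypothetical and 
not taken from the Datanode code. transferTo() may send fewer bytes than asked 
(or zero) when the non-blocking socket's send buffer is full, so the caller 
loops and waits on a Selector for OP_WRITE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of Hadoop.
final class TransferToSketch {

    /** Sends the whole file over a non-blocking socket; returns bytes sent. */
    static long sendFile(Path file, SocketChannel socket) throws IOException {
        socket.configureBlocking(false);
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ);
             Selector selector = Selector.open()) {
            socket.register(selector, SelectionKey.OP_WRITE);
            long pos = 0;
            long size = in.size();
            while (pos < size) {
                // The file read may block on disk; the socket write does not.
                long n = in.transferTo(pos, size - pos, socket);
                pos += n;
                if (n == 0) {
                    // Send buffer is full: wait until the socket is writable.
                    selector.select();
                    selector.selectedKeys().clear();
                }
            }
            return pos;
        }
    }

    /** Loopback demo: sends a temp file of the given size, returns bytes received. */
    static long demo(int size) throws Exception {
        Path file = Files.createTempFile("transfer-sketch", ".bin");
        Files.write(file, new byte[size]);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            long[] received = new long[1];
            Thread reader = new Thread(() -> {
                try (SocketChannel peer = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
                    int n;
                    while ((n = peer.read(buf)) != -1) {
                        received[0] += n;
                        buf.clear();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            reader.start();
            try (SocketChannel socket = SocketChannel.open(server.getLocalAddress())) {
                sendFile(file, socket);
            }
            reader.join();
            return received[0];
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With a blocking socket the same transferTo() loop works without the Selector, 
since each call then waits for buffer space itself.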

> It seems that in order to eliminate extra sockets and threads we're forced to 
> do at least one buffer copy. Am I missing something?

Not necessarily. The main intention in HADOOP-3856 (which could apply here, if 
not initially) is that the Datanode will have a fixed number of threads per 
partition, say 5. These threads invoke transferTo(). As long as 5 threads can 
keep the disks busy, this essentially does as well as thread-per-connection 
could. We will of course make '5' configurable with a good default.
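
A rough sketch of the fixed-threads-per-partition idea, under stated 
assumptions: the class PartitionTransferPools, its methods, and the "/data1" 
partition name are hypothetical, and the demo writes to an in-memory channel 
instead of a client socket. It is not the design from HADOOP-3856, only one way 
to bound the number of threads calling transferTo() per partition.

```java
import java.io.ByteArrayOutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class, not part of Hadoop.
final class PartitionTransferPools implements AutoCloseable {
    private final Map<String, ExecutorService> pools = new HashMap<>();

    /** Creates one fixed-size pool per partition; the size is the configurable '5'. */
    PartitionTransferPools(List<String> partitions, int threadsPerPartition) {
        for (String partition : partitions) {
            pools.put(partition, Executors.newFixedThreadPool(threadsPerPartition));
        }
    }

    /** Queues one file-to-channel transfer on the pool that owns the partition. */
    Future<Long> submit(String partition, Path file, WritableByteChannel target) {
        return pools.get(partition).submit(() -> {
            try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
                long pos = 0;
                long size = in.size();
                while (pos < size) {
                    pos += in.transferTo(pos, size - pos, target);
                }
                return pos;
            }
        });
    }

    @Override
    public void close() {
        // Lets queued transfers finish, then allows the worker threads to exit.
        pools.values().forEach(ExecutorService::shutdown);
    }

    /** Demo: many transfers share one 5-thread pool; returns total bytes moved. */
    static long demo(int transfers, int fileSize) throws Exception {
        Path file = Files.createTempFile("partition-sketch", ".bin");
        Files.write(file, new byte[fileSize]);
        try (PartitionTransferPools pools =
                 new PartitionTransferPools(Arrays.asList("/data1"), 5)) {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < transfers; i++) {
                // In-memory sink stands in for a client connection in this demo.
                WritableByteChannel sink =
                    Channels.newChannel(new ByteArrayOutputStream());
                results.add(pools.submit("/data1", file, sink));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The number of connections served can exceed the number of threads; extra 
transfers simply wait in the pool's queue until a thread is free.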

I will check the MR Shuffle protocol to see how it works now.

> RPC support for large data transfers.
> -------------------------------------
>
>                 Key: HADOOP-4386
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4386
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, ipc
>            Reporter: Raghu Angadi
>
> Currently HDFS has a socket level protocol for serving HDFS data to clients. 
> Clients do not use RPCs to read or write data. Fundamentally there is no 
> reason why this data transfer cannot use RPCs.
> This jira is a placeholder for porting Datanode transfers to RPC. This 
> topic has been discussed in varying detail many times, the latest being in 
> the context of HADOOP-3856. There are quite a few issues to be resolved both 
> at API level and at implementation level. 
> We should probably copy some of the comments from HADOOP-3856 to here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
