[ 
https://issues.apache.org/jira/browse/HDFS-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271156#comment-13271156
 ] 

Eli Collins commented on HDFS-3148:
-----------------------------------

Sanjay, good questions.
This is motivated by a use case where the client is outside the Hadoop 
cluster, specifically a system co-located with the Hadoop cluster where 
individual hosts have strong connectivity, e.g. integration with a DB that 
has multiple high-bandwidth interfaces to use for data import/export. 
This patch has been tested on a system with 4 dual-port InfiniBand cards; 
Hadoop clients running on this host can use the available bandwidth when 
accessing data on the Hadoop cluster. The Hadoop client in this case is 
configured with 4 interfaces (each representing a bond of the two ports). The 
co-located DB use case is mentioned in the design doc, but not explicitly in 
section 2.5; I'll update it.
                
> The client should be able to use multiple local interfaces for data transfer
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-3148
>                 URL: https://issues.apache.org/jira/browse/HDFS-3148
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs client
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 1.1.0, 2.0.0
>
>         Attachments: hdfs-3148-b1.txt, hdfs-3148-b1.txt, hdfs-3148.txt, 
> hdfs-3148.txt, hdfs-3148.txt
>
>
> HDFS-3147 covers using multiple interfaces on the server (Datanode) side. 
> Clients should also be able to utilize multiple *local* interfaces for 
> outbound connections instead of always using the interface for the local 
> hostname. This can be accomplished with a new configuration parameter 
> ({{dfs.client.local.interfaces}}) that accepts a list of interfaces the 
> client should use. Acceptable configuration values are the same as the 
> {{dfs.datanode.available.interfaces}} parameter. The client binds its socket 
> to a specific interface, which enables outbound traffic to use that 
> interface. However, binding the client socket to a specific address is not 
> by itself sufficient to ensure egress traffic uses that interface. E.g., if 
> multiple interfaces are on the same subnet, the host requires IP rules that 
> use the source address (which bind sets) to select the egress interface. The 
> SO_BINDTODEVICE socket option could be used to select a specific interface 
> for the connection instead; however, it requires JNI (it is not in Java's 
> SocketOptions) and root access, which we don't want to require clients to 
> have.
> As in HDFS-3147, the client can use multiple local interfaces for data 
> transfer. Since clients already cache their connections to DNs, choosing a 
> local interface at random seems like a good policy. Users can also pin a 
> specific client to a specific interface by specifying just that interface in 
> {{dfs.client.local.interfaces}}.
> This change was discussed in HADOOP-6210 a while back, and is actually 
> useful independently of the other HDFS-3140 changes.
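
For illustration, the bind-then-connect approach described above can be 
sketched roughly as follows. This is not the code from the attached patches: 
the class LocalInterfaceBinder and its methods are hypothetical names, and the 
sketch assumes the configured value is a comma-separated list of interface 
names (the other accepted value forms are not handled). It resolves each 
interface to its addresses, picks one at random per connection, and binds the 
socket to it before connecting. As noted above, the host may still need 
source-based IP rules for the traffic to actually leave on that interface.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.Socket;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of HDFS: picks a local address at random
// from a configured set of interfaces and binds outbound sockets to it.
class LocalInterfaceBinder {
    private final List<InetAddress> localAddrs;
    private final Random random = new Random();

    LocalInterfaceBinder(List<InetAddress> localAddrs) {
        this.localAddrs = new ArrayList<>(localAddrs);
    }

    // Assumption: the value is a comma-separated list of interface names,
    // e.g. "bond0,bond1". Every address on each named interface is included.
    static LocalInterfaceBinder fromInterfaceNames(String csv)
            throws SocketException {
        List<InetAddress> addrs = new ArrayList<>();
        for (String name : csv.split(",")) {
            name = name.trim();
            if (name.isEmpty()) {
                continue;
            }
            NetworkInterface nif = NetworkInterface.getByName(name);
            if (nif == null) {
                throw new IllegalArgumentException("No such interface: " + name);
            }
            addrs.addAll(Collections.list(nif.getInetAddresses()));
        }
        return new LocalInterfaceBinder(addrs);
    }

    // Returns a randomly chosen local address, or null if none is configured.
    InetAddress pickLocalAddress() {
        if (localAddrs.isEmpty()) {
            return null;
        }
        return localAddrs.get(random.nextInt(localAddrs.size()));
    }

    // Binds to the chosen local address (ephemeral port), then connects.
    // With no configured addresses the OS picks the source address as usual.
    Socket connect(InetSocketAddress remote, int timeoutMs) throws IOException {
        Socket socket = new Socket();
        try {
            InetAddress local = pickLocalAddress();
            if (local != null) {
                socket.bind(new InetSocketAddress(local, 0));
            }
            socket.connect(remote, timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }
}
```

Pinning a client to one interface then corresponds to passing a single 
interface name, so the random choice has only one candidate.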

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

