There are many resources consumed by an open DFS file: file descriptors, sockets, socket buffers, threads, etc.

Better questions to consider might be "How do we support a very large number of open files in HDFS?", which, I think, opens it up to more types of solutions than one. And "What compromises (if any) are acceptable to achieve this?"

I know it's a serious problem for HBase, and every fix, incremental or not, helps. Having a short write timeout on the DataNode in current trunk will help greatly on the DataNode side (threads and sockets). Of course, we need to make the write timeout configurable, which is trivial.
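As a rough illustration (Python, not the actual DataNode code), this is what a configurable write timeout buys: a send to a stalled reader fails after a bounded time instead of pinning a thread and socket indefinitely. The config key named in the docstring is an assumption, not a confirmed Hadoop property name.

```python
import socket

def send_block(sock: socket.socket, data: bytes, write_timeout: float = 480.0) -> None:
    """Send data, giving up if the peer stalls for write_timeout seconds.

    write_timeout would come from a configuration knob (key name
    hypothetical, something along the lines of
    dfs.datanode.socket.write.timeout).
    """
    sock.settimeout(write_timeout)  # in Python this bounds sendall() as well as recv()
    try:
        sock.sendall(data)
    except socket.timeout:
        # Stalled reader: close the socket and free the serving thread
        # instead of blocking forever.
        sock.close()
        raise
```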

One connection between every client and datanode might not be as scalable on a large cluster. Say the cluster has 3000 datanodes and a client has 5000 files open to essentially random datanodes. Then the number of connections from the client is still in the thousands (the same problem as now).
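To put a number on that: even with a single shared connection per (client, datanode) pair, a client whose open files land on uniformly random datanodes still ends up connected to most of the cluster. A back-of-the-envelope sketch (assuming, for simplicity, one random datanode per open file):

```python
def expected_distinct_datanodes(datanodes: int, open_files: int) -> float:
    """Expected number of distinct datanodes a client talks to when each
    open file is served by one uniformly random datanode:
    D * (1 - (1 - 1/D) ** F).
    """
    return datanodes * (1.0 - (1.0 - 1.0 / datanodes) ** open_files)

# With 3000 datanodes and 5000 open files the client still holds
# connections to the bulk of the cluster.
print(round(expected_distinct_datanodes(3000, 5000)))
```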

Raghu.

dhruba Borthakur wrote:
Hi Jim,

Oh, I see. This does not sound too difficult. One can use the connection
pooling code from the RPC layer. The DFS Client can use the pool to
cache open connections.  Also, I assumed that this connection pooling is
enabled only for block reads and not for block writes.
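A minimal sketch of the pooling idea (hypothetical Python, not the RPC layer's actual pooling code): cache one open connection per datanode address and hand it back for reuse on subsequent block reads.

```python
import socket

class ConnectionPool:
    """Cache one open connection per (host, port); open on demand."""

    def __init__(self):
        self._conns = {}

    def get(self, host: str, port: int) -> socket.socket:
        key = (host, port)
        conn = self._conns.get(key)
        if conn is None:
            # First request for this datanode: open and cache a connection.
            conn = socket.create_connection((host, port))
            self._conns[key] = conn
        return conn

    def close_all(self) -> None:
        for conn in self._conns.values():
            conn.close()
        self._conns.clear()
```

A real version would also have to evict broken or idle connections, which is where the RPC layer's existing code would help.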

Would you like to open a JIRA so that we can discuss it in more detail?

Thanks,
dhruba

-----Original Message-----
From: Jim Kellerman [mailto:[EMAIL PROTECTED]
Sent: Friday, March 14, 2008 1:01 PM
To: [email protected]; [EMAIL PROTECTED]
Subject: RE: Multiplexing sockets in DFSClient/datanodes?

I'm not suggesting doing simultaneous transfers, just having one
connection between any one client and any one data node. My thinking was
each transfer would be queued and then processed one at a time.
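What Jim describes, if one assumes a request queue per shared connection, might look roughly like this (a hypothetical sketch, not DFSClient code): transfers bound for the same datanode are queued and executed strictly one at a time.

```python
import queue
import threading

class SerializedConnection:
    """One shared connection; transfers are queued and run one at a time."""

    def __init__(self):
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def transfer(self, block_id: int, send):
        """Queue a transfer; returns a result queue the caller can block on."""
        done = queue.Queue(maxsize=1)
        self._queue.put((block_id, send, done))
        return done

    def _run(self):
        while True:
            block_id, send, done = self._queue.get()
            # Exactly one transfer in flight on this connection at a time.
            done.put(send(block_id))
```

The trade-off, of course, is that a large transfer at the head of the queue delays everything behind it.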

This is a big problem for us. On our cluster at Powerset, we have had
both datanodes and HBase region servers run out of file handles because
there is one open socket per file.

As HBase installations get larger one socket per file just won't scale.

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: dhruba Borthakur [mailto:[EMAIL PROTECTED]
Sent: Friday, March 14, 2008 10:53 AM
To: [email protected]; [EMAIL PROTECTED]
Subject: RE: Multiplexing sockets in DFSClient/datanodes?

Hi Jim,

The protocol between the client and the Datanodes will become
relatively more complex if we decide to multiplex
simultaneous transfers of multiple blocks on the same socket
connection. Do you think that the benefit of saving on system
resources is really appreciable?
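The extra complexity Dhruba refers to comes largely from framing: once several block transfers share one socket, every chunk needs a header identifying which stream it belongs to, and the receiver has to demultiplex. A hypothetical length-prefixed framing (illustrative only, not the actual datanode wire protocol) might look like:

```python
import struct

# Hypothetical frame header: block id (8 bytes) + payload length (4 bytes).
HEADER = struct.Struct("!QI")

def frame(block_id: int, payload: bytes) -> bytes:
    """Prefix a data chunk with the block it belongs to and its length."""
    return HEADER.pack(block_id, len(payload)) + payload

def deframe(buf: bytes):
    """Split a byte stream back into (block_id, payload) chunks."""
    chunks, offset = [], 0
    while offset < len(buf):
        block_id, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        chunks.append((block_id, buf[offset:offset + length]))
        offset += length
    return chunks
```

Every read and write now pays this per-chunk overhead, which is part of why one might worry about streaming performance.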

Thanks,
Dhruba

-----Original Message-----
From: Sanjay Radia [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 12, 2008 11:36 AM
To: [EMAIL PROTECTED]
Subject: Re: Multiplexing sockets in DFSClient/datanodes?

Doug Cutting wrote:
Jim Kellerman wrote:
Yes, multiplexing a socket is more complicated than having one socket per file, but saving system resources seems like a way to scale.

Questions? Comments? Opinions? Flames?

Note that Hadoop RPC already multiplexes, sharing a single socket per pair of JVMs. It would be possible to multiplex datanode connections, and it should not in theory significantly impact performance, but, as you indicate, it would be a significant change. One approach might be to implement HDFS data access using RPC rather than directly using stream i/o.

RPC also tears down idle connections, which HDFS does not. I wonder how much doing that alone might help your case? That would probably be much simpler to implement. Both client and server must already handle connection failures, so it shouldn't be too great of a change to have one or both sides actively close things down if they're idle for more than a few seconds. This is related to adding write timeouts to the datanode (HADOOP-2346).
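The idle teardown Doug suggests could be sketched like this (hypothetical, not the Hadoop RPC implementation): record a last-used timestamp per connection and close anything idle longer than a threshold.

```python
import time

class IdleReaper:
    """Close connections idle for longer than max_idle seconds."""

    def __init__(self, max_idle: float = 5.0, now=time.monotonic):
        self.max_idle = max_idle
        self._now = now  # injectable clock, to keep the sketch testable
        self._last_used = {}  # conn -> timestamp of last activity

    def touch(self, conn) -> None:
        """Record activity on a connection."""
        self._last_used[conn] = self._now()

    def reap(self) -> int:
        """Close and forget idle connections; return how many were closed."""
        cutoff = self._now() - self.max_idle
        idle = [c for c, t in self._last_used.items() if t < cutoff]
        for conn in idle:
            conn.close()
            del self._last_used[conn]
        return len(idle)
```

Since both sides already handle connection failures, a reopened connection after a reap looks just like recovery from a failure.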
Doug,
   Dhruba and I had discussed using RPC in the past. While RPC is a cleaner interface and our RPC implementation has features such as sharing connections, closing idle connections, etc., streaming IO lets us pipe large amounts of data without the request/response exchange. The worry was that IO performance would degrade.
BTW, NFS uses RPC (NFS does not have the write pipeline for replicas).

sanjay
Doug
