There are many resources consumed by an open DFS file: file descriptors, sockets, socket buffers, threads, etc.

Better questions to consider might be "How do we support a very large number of open files in HDFS?", which, I think, opens it up to more types of solutions than one. And "What compromises (if any) are acceptable to achieve this?"

I know it's a serious problem for HBase, and every fix, incremental or not, helps. Having a short write timeout on the DataNode in current trunk will help greatly on the DataNode side (threads and sockets). Of course, we need to make the write timeout configurable, which is trivial.
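As a rough illustration (Python, not the actual DataNode code), this is what a configurable write timeout buys: a send to a stalled reader fails after a bounded time instead of pinning a thread and socket indefinitely. The config key named in the docstring is an assumption, not a confirmed Hadoop property name.

```python
import socket

def send_block(sock: socket.socket, data: bytes, write_timeout: float = 480.0) -> None:
    """Send data, giving up if the peer stalls for write_timeout seconds.

    write_timeout would come from a configuration knob (key name
    hypothetical, something along the lines of
    dfs.datanode.socket.write.timeout).
    """
    sock.settimeout(write_timeout)  # in Python this bounds sendall() as well as recv()
    try:
        sock.sendall(data)
    except socket.timeout:
        # Stalled reader: close the socket and free the serving thread
        # instead of blocking forever.
        sock.close()
        raise
```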

One connection between every client and datanode might not be as scalable on a large cluster. Say the cluster has 3000 datanodes and a client has 5000 files open to essentially random datanodes. Then the number of connections from the client is still in the thousands (the same problem as now).
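To put a number on that: even with a single shared connection per (client, datanode) pair, a client whose open files land on uniformly random datanodes still ends up connected to most of the cluster. A back-of-the-envelope sketch (assuming, for simplicity, one random datanode per open file):

```python
def expected_distinct_datanodes(datanodes: int, open_files: int) -> float:
    """Expected number of distinct datanodes a client talks to when each
    open file is served by one uniformly random datanode:
    D * (1 - (1 - 1/D) ** F).
    """
    return datanodes * (1.0 - (1.0 - 1.0 / datanodes) ** open_files)

# With 3000 datanodes and 5000 open files the client still holds
# connections to the bulk of the cluster.
print(round(expected_distinct_datanodes(3000, 5000)))
```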

Raghu.

dhruba Borthakur wrote:
Hi Jim,

Oh, I see. This does not sound too difficult. One can use the connection
pooling code from the RPC layer. The DFS Client can use the pool to
cache open connections.  Also, I assumed that this connection pooling is
enabled only for block reads and not for block writes.
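A minimal sketch of the pooling idea (hypothetical Python, not the RPC layer's actual pooling code): cache one open connection per datanode address and hand it back for reuse on subsequent block reads.

```python
import socket

class ConnectionPool:
    """Cache one open connection per (host, port); open on demand."""

    def __init__(self):
        self._conns = {}

    def get(self, host: str, port: int) -> socket.socket:
        key = (host, port)
        conn = self._conns.get(key)
        if conn is None:
            # First request for this datanode: open and cache a connection.
            conn = socket.create_connection((host, port))
            self._conns[key] = conn
        return conn

    def close_all(self) -> None:
        for conn in self._conns.values():
            conn.close()
        self._conns.clear()
```

A real version would also have to evict broken or idle connections, which is where the RPC layer's existing code would help.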

Would you like to open a JIRA so that we can discuss it in more detail?

Thanks,
dhruba

-----Original Message-----
From: Jim Kellerman [mailto:[EMAIL PROTECTED]
Sent: Friday, March 14, 2008 1:01 PM
To: [email protected]; [EMAIL PROTECTED]
Subject: RE: Multiplexing sockets in DFSClient/datanodes?

I'm not suggesting doing simultaneous transfers, just having one
connection between any one client and any one data node. My thinking was
each transfer would be queued and then processed one at a time.
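What Jim describes, if one assumes a request queue per shared connection, might look roughly like this (a hypothetical sketch, not DFSClient code): transfers bound for the same datanode are queued and executed strictly one at a time.

```python
import queue
import threading

class SerializedConnection:
    """One shared connection; transfers are queued and run one at a time."""

    def __init__(self):
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def transfer(self, block_id: int, send):
        """Queue a transfer; returns a result queue the caller can block on."""
        done = queue.Queue(maxsize=1)
        self._queue.put((block_id, send, done))
        return done

    def _run(self):
        while True:
            block_id, send, done = self._queue.get()
            # Exactly one transfer in flight on this connection at a time.
            done.put(send(block_id))
```

The trade-off, of course, is that a large transfer at the head of the queue delays everything behind it.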

This is a big problem for us. On our cluster at Powerset, we have had
both datanodes and HBase region servers run out of file handles because
there is one open socket per file.

As HBase installations get larger one socket per file just won't scale.

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: dhruba Borthakur [mailto:[EMAIL PROTECTED]
Sent: Friday, March 14, 2008 10:53 AM
To: [email protected]; [EMAIL PROTECTED]
Subject: RE: Multiplexing sockets in DFSClient/datanodes?

Hi Jim,

The protocol between the client and the Datanodes will become
relatively more complex if we decide to multiplex
simultaneous transfers of multiple blocks on the same socket
connection. Do you think that the benefit of saving on system
resources is really appreciable?
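The extra complexity Dhruba refers to comes largely from framing: once several block transfers share one socket, every chunk needs a header identifying which stream it belongs to, and the receiver has to demultiplex. A hypothetical length-prefixed framing (illustrative only, not the actual datanode wire protocol) might look like:

```python
import struct

# Hypothetical frame header: block id (8 bytes) + payload length (4 bytes).
HEADER = struct.Struct("!QI")

def frame(block_id: int, payload: bytes) -> bytes:
    """Prefix a data chunk with the block it belongs to and its length."""
    return HEADER.pack(block_id, len(payload)) + payload

def deframe(buf: bytes):
    """Split a byte stream back into (block_id, payload) chunks."""
    chunks, offset = [], 0
    while offset < len(buf):
        block_id, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        chunks.append((block_id, buf[offset:offset + length]))
        offset += length
    return chunks
```

Every read and write now pays this per-chunk overhead, which is part of why one might worry about streaming performance.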

Thanks,
Dhruba

-----Original Message-----
From: Sanjay Radia [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 12, 2008 11:36 AM
To: [EMAIL PROTECTED]
Subject: Re: Multiplexing sockets in DFSClient/datanodes?

Doug Cutting wrote:
Jim Kellerman wrote:
Yes, multiplexing a socket is more complicated than having one socket per file, but saving system resources seems like a way to scale.

Questions? Comments? Opinions? Flames?

Note that Hadoop RPC already multiplexes, sharing a single socket per pair of JVMs. It would be possible to multiplex datanode connections, and it should not in theory significantly impact performance, but, as you indicate, it would be a significant change. One approach might be to implement HDFS data access using RPC rather than directly using stream i/o.

RPC also tears down idle connections, which HDFS does not. I wonder how much doing that alone might help your case? That would probably be much simpler to implement. Both client and server must already handle connection failures, so it shouldn't be too great of a change to have one or both sides actively close things down if they're idle for more than a few seconds. This is related to adding write timeouts to the datanode (HADOOP-2346).
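The idle teardown Doug suggests could be sketched like this (hypothetical, not the Hadoop RPC implementation): record a last-used timestamp per connection and close anything idle longer than a threshold.

```python
import time

class IdleReaper:
    """Close connections idle for longer than max_idle seconds."""

    def __init__(self, max_idle: float = 5.0, now=time.monotonic):
        self.max_idle = max_idle
        self._now = now  # injectable clock, to keep the sketch testable
        self._last_used = {}  # conn -> timestamp of last activity

    def touch(self, conn) -> None:
        """Record activity on a connection."""
        self._last_used[conn] = self._now()

    def reap(self) -> int:
        """Close and forget idle connections; return how many were closed."""
        cutoff = self._now() - self.max_idle
        idle = [c for c, t in self._last_used.items() if t < cutoff]
        for conn in idle:
            conn.close()
            del self._last_used[conn]
        return len(idle)
```

Since both sides already handle connection failures, a reopened connection after a reap looks just like recovery from a failure.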
Doug,
   Dhruba and I had discussed using RPC in the past. While RPC is a cleaner interface and our RPC implementation has features such as sharing connections, closing idle connections, etc., streaming IO lets us pipe large amounts of data without the request/response exchange. The worry was that IO performance would degrade.
BTW, NFS uses RPC (NFS does not have the write pipeline for replicas).

sanjay
Doug
