[
https://issues.apache.org/jira/browse/HDFS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074837#comment-14074837
]
Colin Patrick McCabe commented on HDFS-223:
-------------------------------------------
I think the existing thread-pool model kind of makes sense for the Datanode.
The DN has to compute checksums, which inevitably chews the CPU. You can't
really chew the CPU in a non-blocking way. Realistically, if you have 10 disks
and 4096 DN threads chugging along at once (the current
{{dfs.datanode.max.transfer.threads}}), you're going to have about 400
simultaneous operations per disk. It seems like the CPU consumption for CRC32
or hard disk bandwidth would become a bottleneck long before the number of I/O
threads was an issue.
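
To make the arithmetic above concrete, here is a minimal sketch of the per-disk estimate. It assumes transfer threads are spread evenly across disks, which is a simplification; the class and method names are hypothetical and not part of Hadoop.

```java
public class ThreadsPerDisk {
    // Average number of transfer threads contending for each disk,
    // assuming an even spread across disks (a simplifying assumption).
    static int threadsPerDisk(int maxTransferThreads, int disks) {
        return maxTransferThreads / disks; // integer division, rounds down
    }

    public static void main(String[] args) {
        int maxTransferThreads = 4096; // the dfs.datanode.max.transfer.threads figure quoted above
        int disks = 10;                // example disk count from the comment
        System.out.println(threadsPerDisk(maxTransferThreads, disks)); // prints 409
    }
}
```

With roughly 400 threads per disk, each doing checksum work or waiting on the same spindle, CPU or disk bandwidth is likely to saturate before the thread count itself does.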
Some of the scalability issues here were related to spending too much time
creating and tearing down TCP sockets, I think, and were solved by the socket
cache. Hedged reads also help with some of the DFSClient latency spikes
described here.
I think eventually we'll need to re-evaluate this in light of new technology.
But for right now, it's hard to see how we'd use non-blocking to get better
throughput on the DN (as far as I can see).
> Asynchronous IO Handling in Hadoop and HDFS
> -------------------------------------------
>
> Key: HDFS-223
> URL: https://issues.apache.org/jira/browse/HDFS-223
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Raghu Angadi
> Attachments: GrizzlyEchoServer.patch, MinaEchoServer.patch
>
>
> I think Hadoop needs utilities or a framework to make it simpler to deal with
> generic asynchronous IO in Hadoop.
> Example use case :
> It's been a long-standing problem that DataNode takes too many threads for
> data transfers. Each write operation takes up 2 threads at each of the
> datanodes and each read operation takes one irrespective of how much activity
> is on the sockets. The kinds of load that HDFS serves have been expanding
> quite fast and HDFS should handle these varied loads better. If there is a
> framework for non-blocking IO, read and write pipeline state machines could
> be implemented with async events on a fixed number of threads.
> A generic utility is better since it could be used in other places like
> DFSClient. DFSClient currently creates 2 extra threads for each file it has
> open for writing.
> Initially I started writing a primitive "selector", then tried to see if such a
> facility already exists. [Apache MINA|http://mina.apache.org] seemed to do
> exactly this. My impression after looking at the interface and examples is
> that it does not give the kind of control we might prefer or need. The first use
> case I was thinking of implementing using MINA was to replace "response handlers" in
> DataNode. The response handlers are simpler since they don't involve disk
> I/O. I [asked on MINA user
> list|http://www.nabble.com/Async-events-with-existing-NIO-sockets.-td18640767.html],
> but it looks like it cannot be done, I think mainly because the sockets are
> already created.
> Essentially what I have in mind is similar to MINA, except that read and
> write of the sockets is done by the event handlers. The lowest layer
> essentially invokes selectors, then invokes event handlers on a single or on
> multiple threads. Each event handler is expected to do some non-blocking
> work. We would of course have utility handler implementations that do read,
> write, accept etc, that are useful for simple processing.
> Sam Pullara mentioned that [xSockets|http://xsocket.sourceforge.net/] is more
> flexible. It is under GPL.
> Are there other such implementations we should look at?
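
As a rough illustration of the design described in the quoted issue (a lowest layer that runs a selector and hands each ready key to an event handler, which does the read or write itself), here is a minimal sketch using plain java.nio. The names EventLoopSketch, EventHandler, runOnce and the Pipe-based demo are all hypothetical; this is not code from Hadoop, MINA, or the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class EventLoopSketch {
    /** Hypothetical handler: it performs the non-blocking read or write itself. */
    interface EventHandler {
        void handle(SelectionKey key) throws IOException;
    }

    private final Selector selector;

    EventLoopSketch() throws IOException {
        selector = Selector.open();
    }

    /** Registers an already-created channel; it is switched to non-blocking mode first. */
    void register(SelectableChannel channel, int ops, EventHandler handler) throws IOException {
        channel.configureBlocking(false);
        channel.register(selector, ops, handler); // the handler travels as the key attachment
    }

    /** Runs one select pass and dispatches every ready key; returns how many were dispatched. */
    int runOnce(long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        int dispatched = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // selected keys must be removed by hand
            ((EventHandler) key.attachment()).handle(key);
            dispatched++;
        }
        return dispatched;
    }

    void close() throws IOException {
        selector.close();
    }

    /** Demo: write a few bytes into a pipe and let a handler read them on the loop thread. */
    static String demo() {
        try {
            EventLoopSketch loop = new EventLoopSketch();
            Pipe pipe = Pipe.open();
            StringBuilder received = new StringBuilder();
            loop.register(pipe.source(), SelectionKey.OP_READ, key -> {
                ByteBuffer buf = ByteBuffer.allocate(64);
                int n = ((Pipe.SourceChannel) key.channel()).read(buf);
                if (n > 0) {
                    buf.flip();
                    received.append(StandardCharsets.UTF_8.decode(buf));
                }
            });
            pipe.sink().write(ByteBuffer.wrap("ack".getBytes(StandardCharsets.UTF_8)));
            loop.runOnce(1000);
            pipe.sink().close();
            pipe.source().close();
            loop.close();
            return received.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch dispatches handlers on the thread that calls runOnce. Handing ready keys to a small fixed pool instead would match the "multiple threads" variant, at the cost of extra care around interest-set changes.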