[
https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608878#comment-14608878
]
Xiaobing Zhou commented on HDFS-8696:
-------------------------------------
I investigated many aspects of this issue, including the disk I/O queue and the
NN/DN/client packages, but none of that helped. It finally boils down to some
options we have to set explicitly for the Netty server.
1. Netty NIO event loop boss/worker threads: by default there are 1 and 2 of
them, which is too few to handle intensive workloads.
2. ChannelOption.SO_BACKLOG, the maximum queue length for incoming
connections: although it is platform dependent, the default is 50, which is too
small to deal with high concurrency.
3. ChannelOption.SO_SNDBUF and ChannelOption.SO_RCVBUF (the send and receive
buffer sizes) are the key to this issue. Setting them properly yields a
significant performance gain: in my test environment, latency of up to 30
seconds dropped to at most 5 seconds.
4. ChannelOption.WRITE_BUFFER_LOW_WATER_MARK and
ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK (the thresholds that determine
whether a channel is writable) also matter: tuning them dropped the remaining
latency of up to 5 seconds to less than 1 second.
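To illustrate point 4, here is a simplified, JDK-only model of how a low/high water mark pair gates writability. It is a sketch of the general idea, not Netty's actual implementation; the class name WaterMarkSketch and its methods are hypothetical, and the thresholds used in main are arbitrary illustration values.

```java
/**
 * Simplified model of write-buffer water marks (hypothetical class, not Netty code).
 * The channel turns unwritable once pending bytes exceed the high mark and
 * turns writable again only after pending bytes fall below the low mark.
 */
public class WaterMarkSketch {
    private final int lowWaterMark;
    private final int highWaterMark;
    private long pendingBytes;
    private boolean writable = true;

    public WaterMarkSketch(int lowWaterMark, int highWaterMark) {
        if (lowWaterMark < 0 || highWaterMark < lowWaterMark) {
            throw new IllegalArgumentException("require 0 <= low <= high");
        }
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    /** Called when the application queues bytes for sending. */
    public void enqueue(int bytes) {
        pendingBytes += bytes;
        if (pendingBytes > highWaterMark) {
            writable = false; // too much queued: signal back-pressure
        }
    }

    /** Called when queued bytes have actually been written to the socket. */
    public void flushed(int bytes) {
        pendingBytes = Math.max(0, pendingBytes - bytes);
        if (pendingBytes < lowWaterMark) {
            writable = true; // queue drained far enough: resume writing
        }
    }

    public boolean isWritable() {
        return writable;
    }

    public long pendingBytes() {
        return pendingBytes;
    }

    public static void main(String[] args) {
        // Arbitrary illustration values, not recommended settings.
        WaterMarkSketch ch = new WaterMarkSketch(32 * 1024, 64 * 1024);
        ch.enqueue(80 * 1024);
        System.out.println("after enqueue: writable=" + ch.isWritable());
        ch.flushed(30 * 1024);
        System.out.println("after partial flush: writable=" + ch.isWritable());
        ch.flushed(30 * 1024);
        System.out.println("after further flush: writable=" + ch.isWritable());
    }
}
```

The gap between the two marks acts as hysteresis: a writer that respects isWritable() stops queuing data while the buffer is full, so one large transfer is less likely to monopolize memory and the event loop.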
Right now I am making all of these configurable and tuning them to find good
default values for HDFS to ship with.
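As a rough sketch of where the four groups of settings in this comment would be applied, the following shows a Netty 4.0-style ServerBootstrap. It assumes Netty 4.0.x on the classpath; the class name, the port, the thread counts and all buffer sizes are placeholders chosen for illustration, not the values used in the patch or recommended defaults.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

// Hypothetical class; every numeric value below is a placeholder.
public class TunedNettyServerSketch {
    public static void main(String[] args) throws InterruptedException {
        // 1. Explicit boss/worker thread counts instead of relying on defaults.
        EventLoopGroup boss = new NioEventLoopGroup(2);
        EventLoopGroup workers = new NioEventLoopGroup(8);
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                .group(boss, workers)
                .channel(NioServerSocketChannel.class)
                // 2. Accept queue length for incoming connections.
                .option(ChannelOption.SO_BACKLOG, 128)
                // 3. Socket send and receive buffer sizes for accepted connections.
                .childOption(ChannelOption.SO_SNDBUF, 1024 * 1024)
                .childOption(ChannelOption.SO_RCVBUF, 1024 * 1024)
                // 4. Writability thresholds; the high mark is set first so it is
                //    never below the low mark while the options are applied.
                .childOption(ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK, 128 * 1024)
                .childOption(ChannelOption.WRITE_BUFFER_LOW_WATER_MARK, 64 * 1024)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        // The server's own pipeline handlers would be added here.
                    }
                });

            ChannelFuture bound = bootstrap.bind(8080).sync(); // placeholder port
            bound.channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```

In a configurable version, each literal above would presumably be read from a configuration key with a default; the key names are not given here and are left open.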
> Small reads are blocked by large long running reads
> ---------------------------------------------------
>
> Key: HDFS-8696
> URL: https://issues.apache.org/jira/browse/HDFS-8696
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.6.0
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
> Priority: Blocker
>
> There is an issue that appears related to the webhdfs server. When making two
> concurrent requests, the DN will sometimes pause for extended periods (I've
> seen 1-300 seconds), killing performance and dropping connections.
> To reproduce:
> 1. Set up an HDFS cluster
> 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
> the time out to /tmp/times.txt
> {noformat}
> i=1
> while (true); do
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null \
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1";
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do
> echo $i
> let i++
> (/usr/bin/time -f %e curl -s -L -o /dev/null \
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root");
> done
> {noformat}
> After a minute or two, it is easy to see that small reads sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers time out and the DN drops the connection.