[ https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608878#comment-14608878 ]

Xiaobing Zhou commented on HDFS-8696:
-------------------------------------

I investigated many aspects of this issue, including the disk I/O queue and the 
NN/DN/client packages, with no luck. It finally boils down to several options we 
have to set explicitly on the Netty server:
1. Netty NIO event loop boss/worker threads: the defaults are 1 and 2, which 
are too small to handle intensive workloads.
2. ChannelOption.SO_BACKLOG, the maximum queue length for incoming connections 
(though the effective value is platform dependent): the default of 50 is too 
small to deal with high concurrency.
3. ChannelOption.SO_SNDBUF and ChannelOption.SO_RCVBUF (the send and receive 
buffer sizes) are the key to this issue. Setting them properly yields a 
significant performance gain: in my test env, latencies of up to 30 seconds 
dropped to at most 5.
4. ChannelOption.WRITE_BUFFER_LOW_WATER_MARK and 
ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK (the thresholds that indicate 
whether the channel is writable) also matter: tuning them drops the remaining 
latency of up to 5 seconds to less than 1 second.

Right now I am making all of these configurable and tuning them to figure out 
good HDFS default values to ship.
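For illustration, here is a minimal sketch of how these options could be applied to a Netty 4.x ServerBootstrap. The class name, thread counts, and buffer sizes below are hypothetical placeholders for tuning, not the final HDFS defaults:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelOption;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class TunedNettyServer {
    public static ServerBootstrap bootstrap() {
        // Larger boss/worker pools than the 1/2 defaults mentioned above;
        // the counts here are illustrative and would be made configurable.
        NioEventLoopGroup boss = new NioEventLoopGroup(4);
        NioEventLoopGroup worker = new NioEventLoopGroup(16);

        ServerBootstrap b = new ServerBootstrap();
        b.group(boss, worker)
         .channel(NioServerSocketChannel.class)
         // Deeper accept queue than the default of 50 (still platform dependent).
         .option(ChannelOption.SO_BACKLOG, 1024)
         // Per-connection socket buffer sizes; values are placeholders.
         .childOption(ChannelOption.SO_SNDBUF, 256 * 1024)
         .childOption(ChannelOption.SO_RCVBUF, 256 * 1024)
         // Writability thresholds: the channel reports non-writable above the
         // high water mark and writable again below the low water mark.
         .childOption(ChannelOption.WRITE_BUFFER_LOW_WATER_MARK, 32 * 1024)
         .childOption(ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK, 64 * 1024);
        return b;
    }
}
```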

> Small reads are blocked by large long running reads
> ---------------------------------------------------
>
>                 Key: HDFS-8696
>                 URL: https://issues.apache.org/jira/browse/HDFS-8696
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.6.0
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>            Priority: Blocker
>
> There is an issue that appears related to the webhdfs server. When making two 
> concurrent requests, the DN will sometimes pause for extended periods (I've 
> seen 1-300 seconds), killing performance and dropping connections. 
> To reproduce: 
> 1. Set up an HDFS cluster.
> 2. Upload a large file (I was using 10 GB). Perform 1-byte reads, writing
> the times out to /tmp/times.txt:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1";
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> (/usr/bin/time -f %e curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root");
> done
> {noformat}
> It's easy to find after a minute or two that small reads will sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers timeout and the DN drops the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
