[ 
https://issues.apache.org/jira/browse/HDFS-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971512#comment-13971512
 ] 

Daryn Sharp commented on HDFS-6214:
-----------------------------------

I think the question should be: is flush() necessary for _chunked_ transfers?  
It doesn't flush today, so arguably it may not be necessary.  The non-chunked 
use case is where the combination of buffer-12 byte writes and flushing is 
necessary.
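
For readers following along, here is a rough Java sketch of the write pattern 
described above. It is not the code from HDFS-6214.patch. It assumes that 
"buffer-12" means the response buffer size minus 12 bytes of headroom; the 
names FlushingCopy, copy, responseBufferSize and RESERVED_BYTES are 
hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration; not taken from HDFS-6214.patch.
final class FlushingCopy {
    // Assumption: the "12" in "buffer-12" is headroom subtracted from the
    // response buffer size.
    static final int RESERVED_BYTES = 12;

    /**
     * Copies everything from in to out, issuing writes of at most
     * (responseBufferSize - RESERVED_BYTES) bytes and flushing after
     * every write. Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int responseBufferSize)
            throws IOException {
        if (responseBufferSize <= RESERVED_BYTES) {
            throw new IllegalArgumentException(
                "responseBufferSize must exceed " + RESERVED_BYTES);
        }
        byte[] buf = new byte[responseBufferSize - RESERVED_BYTES];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // one write, never larger than buffer - 12
            out.flush();          // always flush, chunked or not
            total += n;
        }
        return total;
    }
}
```

The sketch flushes unconditionally rather than branching on whether the 
response is chunked, which keeps the copy loop to a single code path.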

Chunked performance is the baseline.  I double-checked with our performance 
people: the change did not affect chunked performance, while it significantly 
boosted non-chunked performance.  I chose to always flush for simplicity.

For other watchers, we have been running in production with this change for 
almost a month.

> Webhdfs has poor throughput for files >2GB
> ------------------------------------------
>
>                 Key: HDFS-6214
>                 URL: https://issues.apache.org/jira/browse/HDFS-6214
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-6214.patch
>
>
> For the DN's open call, jetty returns a Content-Length header for files <2GB, 
> and uses chunking for files >2GB.  A "bug" in jetty's buffer handling results 
> in a ~8X reduction in throughput.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
