Hi Guang,

Keep in mind that the data is being encrypted over SSL. If you disable SSL, you will most likely see a significant boost in throughput. Some people have used more powerful machines to make encryption faster.

Thanks,
David

From: Sean Roberts <srobe...@hortonworks.com>
Reply-To: "user@knox.apache.org" <user@knox.apache.org>
Date: Tuesday, September 4, 2018 at 1:53 AM
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: Re: WebHDFS performance issue in Knox

Guang – This is somewhat to be expected.

When you talk to WebHDFS directly, the client can distribute the request across many data nodes. Also, you are getting data directly from the source.

With Knox, all traffic goes through the single Knox host. Knox is responsible for fetching from the datanodes and consolidating the data to send to you. This means overhead, as it's acting as a middle man, and lower network capacity, since only 1 host is serving data to you.

Also, if running on a cloud provider, the Knox host may be a smaller instance size with lower network capacity.

--
Sean Roberts

From: Guang Yang <k...@uber.com>
Reply-To: "user@knox.apache.org" <user@knox.apache.org>
Date: Tuesday, 4 September 2018 at 07:46
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: WebHDFS performance issue in Knox

Hi,

We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a file through WebHDFS in Knox, the download speed is only about 11M/s. However, if we download directly from a datanode, the speed is at least 40M/s.

Are you guys aware of this problem? Any suggestions?

Thanks,
Guang
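
For anyone who wants to reproduce the comparison discussed in this thread, the following is a minimal sketch, not something posted by the participants. It times how long it takes to drain a stream and reports the rate in MiB/s. The helper names `measure_throughput` and `read_in_chunks` are hypothetical, and the stream can be any file-like object, for example an HTTP response opened against the Knox gateway URL and then against the direct WebHDFS URL for the same file.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

MIB = 1024 * 1024


def read_in_chunks(stream, chunk_size: int = MIB) -> Iterator[bytes]:
    """Yield chunks from any object with a .read(n) method until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def measure_throughput(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float, float]:
    """Consume all chunks and return (total_bytes, elapsed_seconds, MiB_per_second)."""
    start = clock()
    total = 0
    for chunk in chunks:
        total += len(chunk)
    elapsed = clock() - start
    # Guard against a zero interval on very small inputs.
    rate = (total / MIB) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


# Example use (assumption: `resp` is a file-like HTTP response for the download,
# opened once through Knox and once directly against WebHDFS):
#   total, seconds, mib_per_s = measure_throughput(read_in_chunks(resp))
```

Running the same measurement against both paths, with and without SSL on the gateway, should show how much of the gap comes from encryption and how much from routing everything through the single Knox host.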