Hi Guang,

Keep in mind the data is being encrypted over SSL.  If you disable SSL you will 
most likely see a very significant boost in throughput.  Some people have used 
more powerful computers to make encryption quicker.

Thanks,

David

From: Sean Roberts <srobe...@hortonworks.com>
Reply-To: "user@knox.apache.org" <user@knox.apache.org>
Date: Tuesday, September 4, 2018 at 1:53 AM
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: Re: WebHDFS performance issue in Knox

Guang – This is somewhat to be expected.

When you talk to WebHDFS directly, the client can distribute the request across 
many data nodes. Also, you are getting data directly from the source.
With Knox, all traffic goes through the single Knox host. Knox is responsible 
for fetching from the datanodes and consolidating to send to you. This means 
overhead as it’s acting as a middle man, and lower network capacity since only 
1 host is serving data to you.

Also, if running on a cloud provider, the Knox host may be a smaller instance 
size with lower network capacity.
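
To see how much of the gap comes from the single Knox hop, one option is to time the same download through both paths. Below is a minimal sketch, not a definitive benchmark. The host names, port numbers, topology name and file path are hypothetical placeholders, and it assumes both URLs can be reached without extra authentication or custom TLS settings (a real Knox deployment will usually need credentials and a trusted certificate).

```python
import time
import urllib.request


def measure_throughput(chunks, clock=time.perf_counter):
    """Consume an iterable of byte chunks; return (total_bytes, MB_per_second)."""
    start = clock()
    total = 0
    for chunk in chunks:
        total += len(chunk)
    elapsed = clock() - start
    # Guard against a zero-length interval on very small inputs.
    rate = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, rate


def read_chunks(url, chunk_size=1 << 20):
    """Yield the response body of `url` in chunks of up to `chunk_size` bytes."""
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk


if __name__ == "__main__":
    # Hypothetical endpoints: replace hosts, ports, topology and path with your own.
    urls = {
        "direct": "http://namenode.example.com:50070/webhdfs/v1/tmp/test.bin?op=OPEN",
        "knox": "https://knox.example.com:8443/gateway/default/webhdfs/v1/tmp/test.bin?op=OPEN",
    }
    for name, url in urls.items():
        total, rate = measure_throughput(read_chunks(url))
        print(f"{name}: {total} bytes at {rate:.1f} MB/s")
```

Running it a few times against the same file, from the same client machine, should give a fairer comparison than a single download.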
--
Sean Roberts

From: Guang Yang <k...@uber.com>
Reply-To: "user@knox.apache.org" <user@knox.apache.org>
Date: Tuesday, 4 September 2018 at 07:46
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: WebHDFS performance issue in Knox

Hi,

We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a file through 
WebHDFS via Knox, the download speed is only about 11M/s. However, if we 
download directly from the datanode, the speed is at least 40M/s.

Are you guys aware of this problem? Any suggestions?

Thanks,
Guang
