I really don't think a difference of that size should be expected from SSL overhead alone. However, I don't have any metrics to contradict it either, since I do not run Knox without SSL.
Given the above, I am struggling to come up with a meaningful response to this. :( I don't think you should see a 10-fold increase in speed by disabling SSL, though.

On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:

> Any idea guys?
>
> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>
>> Thanks guys! The issue seems exactly what David pointed out, which is
>> because of encrypted over SSL.
>>
>> Without Knox, the download speed can reach to *400M/s* if I call
>> Namenode directly. And with disabling SSL, the speed can reach to
>> *~400M/s* as well through Knox. But with SSL, the speed drops
>> significantly to *~40M/s*. I know it's because of encrypted, but it does
>> surprised me with such a difference. Is it normal from your perspective?
>>
>> Thanks,
>> Guang
>>
>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <
>> dvillarr...@hortonworks.com> wrote:
>>
>>> Hi Guang,
>>>
>>> Keep in mind the data is being encrypted over SSL. If you disable SSL
>>> you will most likely see a very significant boost in throughput. Some
>>> people have used more powerful computers to make encryption quicker.
>>>
>>> Thanks,
>>>
>>> David
>>>
>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>
>>> Guang – This is somewhat to be expected.
>>>
>>> When you talk to WebHDFS directly, the client can distribute the request
>>> across many data nodes. Also, you are getting data directly from the source.
>>>
>>> With Knox, all traffic goes through the single Knox host. Knox is
>>> responsible for fetching from the datanodes and consolidating to send to
>>> you. This means overhead as it’s acting as a middle man, and lower network
>>> capacity since only 1 host is serving data to you.
>>>
>>> Also, if running on a cloud provider, the Knox host may be a smaller
>>> instance size with lower network capacity.
>>>
>>> --
>>>
>>> Sean Roberts
>>>
>>> *From: *Guang Yang <k...@uber.com>
>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Subject: *WebHDFS performance issue in Knox
>>>
>>> Hi,
>>>
>>> We're using Knox 1.1.0 to proxy WebHDFS request. If we download a file
>>> through WebHDFS in Knox, the download speed is just about 11M/s. However,
>>> if we download directly from datanode, the speed is about 40M/s at least.
>>>
>>> Are you guys aware of this problem? Any suggestion?
>>>
>>> Thanks,
>>>
>>> Guang
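
For anyone who wants to put comparable numbers on the three paths discussed in this thread (direct, through Knox without SSL, through Knox with SSL), here is a minimal sketch of a throughput measurement in Python. It is only an illustration: `measure_throughput` and `slowdown` are hypothetical helper names, the in-memory payload stands in for a real download, and it reports decimal megabytes per second, which may or may not be the unit meant by "M/s" above.

```python
import io
import time


def measure_throughput(stream, chunk_size=1024 * 1024):
    """Read a binary stream to the end.

    Returns (total_bytes, elapsed_seconds, megabytes_per_second).
    """
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small inputs.
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s


def slowdown(fast_rate, slow_rate):
    """Ratio between two measured rates, e.g. without SSL vs. with SSL."""
    return fast_rate / slow_rate


if __name__ == "__main__":
    # Stand-in payload (assumption): 8 MiB of zeros instead of a real file.
    payload = io.BytesIO(b"\0" * (8 * 1024 * 1024))
    total, elapsed, rate = measure_throughput(payload)
    print(f"read {total} bytes in {elapsed:.4f}s ({rate:.1f} MB/s)")

    # The figures reported in this thread: ~400 M/s vs. ~40 M/s.
    print(f"slowdown factor: {slowdown(400, 40):.0f}x")
```

To measure a real download, the `BytesIO` object could be swapped for any object with a `read()` method, for example the response returned by `urllib.request.urlopen(url)`. Running the same file through each path several times and comparing the rates would show whether the roughly 10x gap is reproducible.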