I really don't think SSL overhead alone should account for that kind of
difference.
However, I don't have any metrics to contradict it either, since I do not run
Knox without SSL.

Given the above, I am struggling to come up with a meaningful response to
this. :(
I don't think you should see a 10-fold increase in speed by disabling SSL,
though.
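
For anyone who wants to put numbers on this, here is a minimal sketch of a
throughput timer. It is not from Knox or from anyone's setup in this thread;
the function name measure_throughput and the 1 MB chunk size are my own
assumptions. It just reads any binary stream to the end and reports MB/s, so
the same measurement can be repeated for each path being compared.

```python
import io
import time


def measure_throughput(stream, chunk_size=1024 * 1024):
    """Read a binary stream to EOF; return (total_bytes, seconds, mb_per_s).

    Hypothetical helper for illustration only, not part of Knox or WebHDFS.
    """
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small inputs.
    rate = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


if __name__ == "__main__":
    # Stand-in for a real download: 8 MB of in-memory data.
    total, elapsed, rate = measure_throughput(io.BytesIO(b"\0" * 8_000_000))
    print(f"{total} bytes in {elapsed:.4f}s = {rate:.1f} MB/s")
```

The same function should accept the response object returned by
urllib.request.urlopen, so fetching one file directly, through Knox with SSL,
and through Knox without SSL would give three comparable figures.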

On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:

> Any ideas, guys?
>
> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>
>> Thanks guys! The issue seems to be exactly what David pointed out: the
>> data is being encrypted over SSL.
>>
>> Without Knox, the download speed can reach *400M/s* if I call the
>> Namenode directly. With SSL disabled, the speed can reach *~400M/s*
>> through Knox as well. But with SSL, the speed drops significantly to
>> *~40M/s*. I know it's because of the encryption, but such a difference
>> did surprise me. Is this normal from your perspective?
>>
>> Thanks,
>> Guang
>>
>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <
>> dvillarr...@hortonworks.com> wrote:
>>
>>> Hi Guang,
>>>
>>>
>>>
>>> Keep in mind the data is being encrypted over SSL.  If you disable SSL
>>> you will most likely see a very significant boost in throughput.  Some
>>> people have used more powerful computers to make encryption quicker.
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> David
>>>
>>>
>>>
>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>
>>>
>>>
>>> Guang – This is somewhat to be expected.
>>>
>>>
>>>
>>> When you talk to WebHDFS directly, the client can distribute the request
>>> across many data nodes. Also, you are getting data directly from the source.
>>>
>>> With Knox, all traffic goes through the single Knox host. Knox is
>>> responsible for fetching from the datanodes and consolidating to send to
>>> you. This means overhead as it’s acting as a middle man, and lower network
>>> capacity since only 1 host is serving data to you.
>>>
>>>
>>>
>>> Also, if running on a cloud provider, the Knox host may be a smaller
>>> instance size with lower network capacity.
>>>
>>> --
>>>
>>> Sean Roberts
>>>
>>>
>>>
>>> *From: *Guang Yang <k...@uber.com>
>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>> *Subject: *WebHDFS performance issue in Knox
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a file
>>> through WebHDFS in Knox, the download speed is just about 11M/s. However,
>>> if we download directly from a datanode, the speed is at least about 40M/s.
>>>
>>>
>>>
>>> Are you guys aware of this problem? Any suggestions?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Guang
>>>
>>
>>
>
