So I found this in the Knox issues list in JIRA:

https://issues.apache.org/jira/browse/KNOX-1221

It sounds familiar in terms of a slowdown when going through Knox.

Kevin Risden


On Sat, Sep 15, 2018 at 10:17 PM Kevin Risden <kris...@apache.org> wrote:

> Hmmm yea curl for a single file should do the handshake once.
>
> What are the system performance statistics during the SSL vs non SSL
> testing? CPU/memory/disk/etc? Ambari metrics with Grafana would help here
> if using that. Otherwise watching top may be helpful. It would help to
> determine whether Knox is working harder during the SSL transfer.
>
> Kevin Risden
>
>
> On Wed, Sep 12, 2018 at 2:52 PM Guang Yang <k...@uber.com> wrote:
>
>> I'm just using curl to download a single large file. So I suspect SSL
>> handshake just happens once?
>>
>> On Tue, Sep 11, 2018 at 12:02 PM Kevin Risden <kris...@apache.org> wrote:
>>
>>> What client are you using to connect to Knox? Is this for a single file
>>> or a bunch of files?
>>>
>>> The SSL handshake can be slow if the client doesn't keep the connection
>>> open.
>>>
>>> Kevin Risden
>>>
>>> On Tue, Sep 11, 2018, 14:51 Guang Yang <k...@uber.com> wrote:
>>>
>>>> Thanks Larry. But the only difference is this part in my
>>>> gateway-site.xml.
>>>>
>>>> <property>
>>>>         <name>ssl.enabled</name>
>>>>         <value>false</value>
>>>>         <description>Indicates whether SSL is enabled.</description>
>>>> </property>
>>>>
>>>> On Tue, Sep 11, 2018 at 11:42 AM, larry mccay <lmc...@apache.org>
>>>> wrote:
>>>>
>>>>> I really don't think that kind of difference should be expected from
>>>>> merely SSL overhead.
>>>>> I don't however have any metrics to contradict it either since I do
>>>>> not run Knox without SSL.
>>>>>
>>>>> Given the above, I am struggling to come up with a meaningful response
>>>>> to this. :(
>>>>> I don't think you should see a 10 fold increase in speed by disabling
>>>>> SSL though.
>>>>>
>>>>> On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:
>>>>>
>>>>>> Any idea guys?
>>>>>>
>>>>>> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>>>>>>
>>>>>>> Thanks guys! The issue seems to be exactly what David pointed out: the
>>>>>>> data is being encrypted over SSL.
>>>>>>>
>>>>>>> Without Knox, the download speed can reach *400M/s* if I call the
>>>>>>> Namenode directly. With SSL disabled, the speed can reach *~400M/s*
>>>>>>> through Knox as well. But with SSL, the speed drops significantly to
>>>>>>> *~40M/s*. I know it's because of the encryption, but such a difference
>>>>>>> does surprise me. Is it normal from your perspective?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Guang
>>>>>>>
>>>>>>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <dvillarr...@hortonworks.com> wrote:
>>>>>>>
>>>>>>>> Hi Guang,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Keep in mind the data is being encrypted over SSL.  If you disable
>>>>>>>> SSL you will most likely see a very significant boost in throughput.
>>>>>>>> Some people have used more powerful computers to make encryption
>>>>>>>> quicker.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Guang – This is somewhat to be expected.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> When you talk to WebHDFS directly, the client can distribute the
>>>>>>>> request across many data nodes. Also, you are getting data directly
>>>>>>>> from the source.
>>>>>>>>
>>>>>>>> With Knox, all traffic goes through the single Knox host. Knox is
>>>>>>>> responsible for fetching from the datanodes and consolidating to send
>>>>>>>> to you. This means overhead as it’s acting as a middle man, and lower
>>>>>>>> network capacity since only 1 host is serving data to you.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, if running on a cloud provider, the Knox host may be a
>>>>>>>> smaller instance size with lower network capacity.
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Sean Roberts
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From: *Guang Yang <k...@uber.com>
>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> *Subject: *WebHDFS performance issue in Knox
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a
>>>>>>>> file through WebHDFS in Knox, the download speed is just about 11M/s.
>>>>>>>> However, if we download directly from a datanode, the speed is at
>>>>>>>> least about 40M/s.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Are you guys aware of this problem? Any suggestion?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Guang
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
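
For anyone who wants to put numbers on the comparison discussed in this thread (direct WebHDFS, Knox without SSL, Knox with SSL), here is a rough sketch of a throughput measurement in Python. It is only an illustration under several assumptions: the function names are made up for this example, "M/s" is read as megabytes per second (the thread does not say which unit is meant), and authentication and certificate trust for a real Knox endpoint are left out entirely.

```python
import time
import urllib.request


def throughput_mb_per_s(num_bytes: int, seconds: float) -> float:
    """Convert a byte count and elapsed time to MB/s (1 MB = 10**6 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes / 1e6 / seconds


def slowdown(baseline_mb_s: float, observed_mb_s: float) -> float:
    """How many times slower the observed rate is than the baseline."""
    if observed_mb_s <= 0:
        raise ValueError("observed rate must be positive")
    return baseline_mb_s / observed_mb_s


def measure_download(url: str, chunk_size: int = 1 << 20) -> float:
    """Stream a URL, discard the data, and return the average MB/s.

    Assumption: the URL is reachable without extra auth or custom TLS setup,
    which will usually not be true for a secured Knox gateway.
    """
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return throughput_mb_per_s(total, time.monotonic() - start)


# Figures reported in the thread: ~400 without SSL, ~40 with SSL.
print(slowdown(400.0, 40.0))
```

Running measure_download once against each endpoint with the same file, while watching CPU on the Knox host, would show whether the gap lines up with Knox being CPU bound during the SSL transfer.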
