The SSL handshake will likely happen at least twice: once for the request
through Knox to the NN, and then again because the redirect from the NN to the
DN goes all the way back to the client. The client has to follow the redirect
and do the handshake again for the DN request.
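
One way to check this would be to time the two legs separately instead of
letting the client follow the redirect silently. Below is a minimal Python
sketch, not a tested benchmark. The host, the topology name and the
/gateway/<topology>/webhdfs/v1 URL layout are assumptions about a typical Knox
setup, and build_open_url and time_two_legs are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so each leg can be timed."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def build_open_url(gateway_base, topology, hdfs_path):
    """Build a WebHDFS OPEN URL behind Knox (the URL layout is an assumption)."""
    base = gateway_base.rstrip("/")
    path = hdfs_path.lstrip("/")
    return f"{base}/gateway/{topology}/webhdfs/v1/{path}?op=OPEN"


def drain(response):
    """Read the body in 1 MiB chunks and return the number of bytes read."""
    size = 0
    while chunk := response.read(1 << 20):
        size += len(chunk)
    return size


def time_two_legs(url, headers=None):
    """Return (first_leg_seconds, second_leg_seconds, bytes_read)."""
    headers = headers or {}
    opener = urllib.request.build_opener(NoRedirect())

    start = time.perf_counter()
    try:
        with opener.open(urllib.request.Request(url, headers=headers)) as resp:
            # No redirect was issued; the body arrived on the first leg.
            size = drain(resp)
            return time.perf_counter() - start, 0.0, size
    except urllib.error.HTTPError as err:
        if err.code not in (301, 302, 303, 307, 308):
            raise
        location = err.headers["Location"]
    first_leg = time.perf_counter() - start

    # Second leg: a new request to the redirect target. The same headers are
    # re-sent, which is only appropriate if the target is a trusted host.
    start = time.perf_counter()
    follow_up = urllib.request.Request(location, headers=headers)
    with urllib.request.urlopen(follow_up) as resp:
        size = drain(resp)
    return first_leg, time.perf_counter() - start, size
```

For example, time_two_legs(build_open_url("https://knox.example.com:8443",
"default", "/tmp/big.bin")) would return the time spent getting the redirect,
the time spent on the actual download, and the byte count. Credentials and
trust for a self-signed gateway certificate are left out here and would have
to be supplied for a real gateway.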


On Sun, Sep 23, 2018 at 8:30 PM Kevin Risden <kris...@apache.org> wrote:

> So I found this in the Knox issues list in JIRA:
>
> https://issues.apache.org/jira/browse/KNOX-1221
>
> It sounds familiar in terms of a slowdown when going through Knox.
>
> Kevin Risden
>
>
> On Sat, Sep 15, 2018 at 10:17 PM Kevin Risden <kris...@apache.org> wrote:
>
>> Hmmm yea curl for a single file should do the handshake once.
>>
>> What are the system performance statistics during the SSL vs non SSL
>> testing? CPU/memory/disk/etc? Ambari metrics with Grafana would help here
>> if using that. Otherwise watching top may be helpful. It would help to
>> determine whether Knox is working harder during the SSL transfer.
>>
>> Kevin Risden
>>
>>
>> On Wed, Sep 12, 2018 at 2:52 PM Guang Yang <k...@uber.com> wrote:
>>
>>> I'm just using curl to download a single large file. So I suspect the SSL
>>> handshake happens just once?
>>>
>>> On Tue, Sep 11, 2018 at 12:02 PM Kevin Risden <kris...@apache.org> wrote:
>>>
>>>> What client are you using to connect to Knox? Is this for a single file
>>>> or a bunch of files?
>>>>
>>>> The SSL handshake can be slow if the client doesn't keep the connection
>>>> open.
>>>>
>>>> Kevin Risden
>>>>
>>>> On Tue, Sep 11, 2018, 14:51 Guang Yang <k...@uber.com> wrote:
>>>>
>>>>> Thanks Larry. But the only difference is this part in my
>>>>> gateway-site.xml.
>>>>>
>>>>> <property>
>>>>>         <name>ssl.enabled</name>
>>>>>         <value>false</value>
>>>>>         <description>Indicates whether SSL is enabled.</description>
>>>>> </property>
>>>>>
>>>>> On Tue, Sep 11, 2018 at 11:42 AM, larry mccay <lmc...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I really don't think that kind of difference should be expected from
>>>>>> merely SSL overhead.
>>>>>> I don't however have any metrics to contradict it either since I do
>>>>>> not run Knox without SSL.
>>>>>>
>>>>>> Given the above, I am struggling to come up with a meaningful response
>>>>>> to this. :(
>>>>>> I don't think you should see a 10-fold increase in speed by disabling
>>>>>> SSL though.
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:
>>>>>>
>>>>>>> Any idea guys?
>>>>>>>
>>>>>>> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>>>>>>>
>>>>>>>> Thanks guys! The issue seems to be exactly what David pointed out:
>>>>>>>> the data is being encrypted over SSL.
>>>>>>>>
>>>>>>>> Without Knox, the download speed can reach *400M/s* if I call the
>>>>>>>> Namenode directly. With SSL disabled, the speed can reach
>>>>>>>> *~400M/s* through Knox as well. But with SSL, the speed drops
>>>>>>>> significantly to *~40M/s*. I know it's because of the encryption, but
>>>>>>>> such a difference did surprise me. Is it normal from your
>>>>>>>> perspective?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Guang
>>>>>>>>
>>>>>>>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <
>>>>>>>> dvillarr...@hortonworks.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Guang,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Keep in mind the data is being encrypted over SSL. If you disable
>>>>>>>>> SSL you will most likely see a very significant boost in throughput.
>>>>>>>>> Some people have used more powerful computers to make encryption
>>>>>>>>> quicker.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guang – This is somewhat to be expected.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> When you talk to WebHDFS directly, the client can distribute the
>>>>>>>>> request across many data nodes. Also, you are getting data directly
>>>>>>>>> from the source.
>>>>>>>>>
>>>>>>>>> With Knox, all traffic goes through the single Knox host. Knox is
>>>>>>>>> responsible for fetching from the datanodes and consolidating to
>>>>>>>>> send to you. This means overhead as it’s acting as a middle man, and
>>>>>>>>> lower network capacity since only 1 host is serving data to you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, if running on a cloud provider, the Knox host may be a
>>>>>>>>> smaller instance size with lower network capacity.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Sean Roberts
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From: *Guang Yang <k...@uber.com>
>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>> *Subject: *WebHDFS performance issue in Knox
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a
>>>>>>>>> file through WebHDFS in Knox, the download speed is just about 11M/s.
>>>>>>>>> However, if we download directly from the datanode, the speed is at
>>>>>>>>> least about 40M/s.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Are you guys aware of this problem? Any suggestions?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Guang
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
