So I was able to reproduce a slowdown with SSL on a pseudo-distributed
HDFS setup on a single node, with Knox running on the same node. This was
set up in VirtualBox on my laptop.

Rough timings with wget for a 1GB random file:

   - directly to webhdfs - 1,073,741,824  252MB/s   in 3.8s
   - knox no ssl - 1,073,741,824  264MB/s   in 3.6s
   - knox ssl - 1,073,741,824 54.3MB/s   in 20s

Throughput drops by roughly 5x with Knox SSL, and I don't yet know why.
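
To put a number on that, here is a small Python sketch that works out the
slowdown from the wget rates quoted above. The dictionary and function names
are just illustrative, and the rates are the rough figures wget printed, so
treat the result as approximate.

```python
# Average rates as printed by wget in the timings above (MB/s).
# These are rough, single-run numbers from a VM, not a careful benchmark.
RATES_MB_S = {
    "webhdfs direct": 252.0,
    "knox no ssl": 264.0,
    "knox ssl": 54.3,
}


def slowdown(baseline_mb_s, measured_mb_s):
    """Return how many times slower the measured rate is than the baseline."""
    if measured_mb_s <= 0:
        raise ValueError("measured rate must be positive")
    return baseline_mb_s / measured_mb_s


if __name__ == "__main__":
    ssl_rate = RATES_MB_S["knox ssl"]
    for name, rate in RATES_MB_S.items():
        if name != "knox ssl":
            # Compare each non-SSL path against the Knox SSL path.
            print(f"{name} vs knox ssl: {slowdown(rate, ssl_rate):.2f}x")
```

Both non-SSL paths come out a bit under 5x faster than the Knox SSL path.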

Kevin Risden


On Sun, Sep 23, 2018 at 8:53 PM larry mccay <lmc...@apache.org> wrote:

> SSL handshake will likely happen at least twice.
> Once for the request through Knox to the NN then the redirect from the NN
> to the DN goes all the way back to the client.
> So they have to follow the redirect and do the handshake to the DN.
>
>
> On Sun, Sep 23, 2018 at 8:30 PM Kevin Risden <kris...@apache.org> wrote:
>
>> So I found this in the Knox issues list in JIRA:
>>
>> https://issues.apache.org/jira/browse/KNOX-1221
>>
>> It sounds familiar in terms of a slowdown when going through Knox.
>>
>> Kevin Risden
>>
>>
>> On Sat, Sep 15, 2018 at 10:17 PM Kevin Risden <kris...@apache.org> wrote:
>>
>>> Hmmm yea curl for a single file should do the handshake once.
>>>
>>> What are the system performance statistics during the SSL vs non SSL
>>> testing? CPU/memory/disk/etc? Ambari metrics with Grafana would help here
>>> if using that. Otherwise watching top may be helpful. It would help to
>>> determine if Knox is working harder during the SSL transfer.
>>>
>>> Kevin Risden
>>>
>>>
>>> On Wed, Sep 12, 2018 at 2:52 PM Guang Yang <k...@uber.com> wrote:
>>>
>>>> I'm just using curl to download a single large file. So I suspect SSL
>>>> handshake just happens once?
>>>>
>>>> On Tue, Sep 11, 2018 at 12:02 PM Kevin Risden <kris...@apache.org> wrote:
>>>>
>>>>> What client are you using to connect to Knox? Is this for a single
>>>>> file or a bunch of files?
>>>>>
>>>>> The SSL handshake can be slow if the client doesn't keep the
>>>>> connection open.
>>>>>
>>>>> Kevin Risden
>>>>>
>>>>> On Tue, Sep 11, 2018, 14:51 Guang Yang <k...@uber.com> wrote:
>>>>>
>>>>>> Thanks Larry. But the only difference is this part in my
>>>>>> gateway-site.xml.
>>>>>>
>>>>>> <property>
>>>>>>         <name>ssl.enabled</name>
>>>>>>         <value>false</value>
>>>>>>         <description>Indicates whether SSL is enabled.</description>
>>>>>> </property>
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 11:42 AM, larry mccay <lmc...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I really don't think that kind of difference should be expected from
>>>>>>> merely SSL overhead.
>>>>>>> I don't however have any metrics to contradict it either since I do
>>>>>>> not run Knox without SSL.
>>>>>>>
>>>>>>> Given the above, I am struggling to come up with a meaningful
>>>>>>> response to this. :(
>>>>>>> I don't think you should see a 10 fold increase in speed by
>>>>>>> disabling SSL though.
>>>>>>>
>>>>>>> On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:
>>>>>>>
>>>>>>>> Any idea guys?
>>>>>>>>
>>>>>>>> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks guys! The issue seems to be exactly what David pointed out:
>>>>>>>>> the data is being encrypted over SSL.
>>>>>>>>>
>>>>>>>>> Without Knox, the download speed can reach *400M/s* if I call the
>>>>>>>>> Namenode directly. With SSL disabled, the speed can reach
>>>>>>>>> *~400M/s* through Knox as well. But with SSL, the speed drops
>>>>>>>>> significantly to *~40M/s*. I know it's because of the encryption, but
>>>>>>>>> such a difference did surprise me. Is it normal from your
>>>>>>>>> perspective?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Guang
>>>>>>>>>
>>>>>>>>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <
>>>>>>>>> dvillarr...@hortonworks.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Guang,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Keep in mind the data is being encrypted over SSL.  If you
>>>>>>>>>> disable SSL you will most likely see a very significant boost in
>>>>>>>>>> throughput.  Some people have used more powerful computers to make
>>>>>>>>>> encryption quicker.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *From: *Sean Roberts <srobe...@hortonworks.com>
>>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>> *Date: *Tuesday, September 4, 2018 at 1:53 AM
>>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>> *Subject: *Re: WebHDFS performance issue in Knox
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Guang – This is somewhat to be expected.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> When you talk to WebHDFS directly, the client can distribute the
>>>>>>>>>> request across many data nodes. Also, you are getting data directly
>>>>>>>>>> from the source.
>>>>>>>>>>
>>>>>>>>>> With Knox, all traffic goes through the single Knox host. Knox is
>>>>>>>>>> responsible for fetching from the datanodes and consolidating to
>>>>>>>>>> send to you. This means overhead as it’s acting as a middle man,
>>>>>>>>>> and lower network capacity since only 1 host is serving data to you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Also, if running on a cloud provider, the Knox host may be a
>>>>>>>>>> smaller instance size with lower network capacity.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Sean Roberts
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *From: *Guang Yang <k...@uber.com>
>>>>>>>>>> *Reply-To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>> *Date: *Tuesday, 4 September 2018 at 07:46
>>>>>>>>>> *To: *"user@knox.apache.org" <user@knox.apache.org>
>>>>>>>>>> *Subject: *WebHDFS performance issue in Knox
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a
>>>>>>>>>> file through WebHDFS in Knox, the download speed is just about 11M/s.
>>>>>>>>>> However, if we download directly from the datanode, the speed is at
>>>>>>>>>> least about 40M/s.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Are you guys aware of this problem? Any suggestion?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Guang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
