So I found this in the Knox issues list in JIRA: https://issues.apache.org/jira/browse/KNOX-1221
It sounds familiar in terms of a slowdown when going through Knox.

Kevin Risden


On Sat, Sep 15, 2018 at 10:17 PM Kevin Risden <kris...@apache.org> wrote:

> Hmmm yea curl for a single file should do the handshake once.
>
> What are the system performance statistics during the SSL vs non SSL
> testing? CPU/memory/disk/etc? Ambari metrics with Grafana would help here
> if using that. Otherwise watching top may be helpful. It would help to
> determine if Knox is working harder during the SSL transfer.
>
> Kevin Risden
>
>
> On Wed, Sep 12, 2018 at 2:52 PM Guang Yang <k...@uber.com> wrote:
>
>> I'm just using curl to download a single large file. So I suspect the SSL
>> handshake just happens once?
>>
>> On Tue, Sep 11, 2018 at 12:02 PM Kevin Risden <kris...@apache.org> wrote:
>>
>>> What client are you using to connect to Knox? Is this for a single file
>>> or a bunch of files?
>>>
>>> The SSL handshake can be slow if the client doesn't keep the connection
>>> open.
>>>
>>> Kevin Risden
>>>
>>> On Tue, Sep 11, 2018, 14:51 Guang Yang <k...@uber.com> wrote:
>>>
>>>> Thanks Larry. But the only difference is this part in my
>>>> gateway-site.xml.
>>>>
>>>> <property>
>>>>   <name>ssl.enabled</name>
>>>>   <value>false</value>
>>>>   <description>Indicates whether SSL is enabled.</description>
>>>> </property>
>>>>
>>>> On Tue, Sep 11, 2018 at 11:42 AM, larry mccay <lmc...@apache.org> wrote:
>>>>
>>>>> I really don't think that kind of difference should be expected from
>>>>> merely SSL overhead.
>>>>> I don't however have any metrics to contradict it either since I do
>>>>> not run Knox without SSL.
>>>>>
>>>>> Given the above, I am struggling to come up with a meaningful response
>>>>> to this. :(
>>>>> I don't think you should see a 10 fold increase in speed by disabling
>>>>> SSL though.
>>>>>
>>>>> On Tue, Sep 11, 2018 at 2:35 PM Guang Yang <k...@uber.com> wrote:
>>>>>
>>>>>> Any idea guys?
>>>>>>
>>>>>> On Mon, Sep 10, 2018 at 3:07 PM, Guang Yang <k...@uber.com> wrote:
>>>>>>
>>>>>>> Thanks guys! The issue seems to be exactly what David pointed out:
>>>>>>> the data is being encrypted over SSL.
>>>>>>>
>>>>>>> Without Knox, the download speed can reach *400M/s* if I call the
>>>>>>> Namenode directly. With SSL disabled, the speed can reach *~400M/s*
>>>>>>> through Knox as well. But with SSL, the speed drops significantly to
>>>>>>> *~40M/s*. I know it's because of the encryption, but such a
>>>>>>> difference did surprise me. Is it normal from your perspective?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Guang
>>>>>>>
>>>>>>> On Tue, Sep 4, 2018 at 11:07 AM, David Villarreal <dvillarr...@hortonworks.com> wrote:
>>>>>>>
>>>>>>>> Hi Guang,
>>>>>>>>
>>>>>>>> Keep in mind the data is being encrypted over SSL. If you disable
>>>>>>>> SSL you will most likely see a very significant boost in throughput.
>>>>>>>> Some people have used more powerful computers to make encryption
>>>>>>>> quicker.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> From: Sean Roberts <srobe...@hortonworks.com>
>>>>>>>> Reply-To: "user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> Date: Tuesday, September 4, 2018 at 1:53 AM
>>>>>>>> To: "user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> Subject: Re: WebHDFS performance issue in Knox
>>>>>>>>
>>>>>>>> Guang – This is somewhat to be expected.
>>>>>>>>
>>>>>>>> When you talk to WebHDFS directly, the client can distribute the
>>>>>>>> request across many data nodes. Also, you are getting data directly
>>>>>>>> from the source.
>>>>>>>>
>>>>>>>> With Knox, all traffic goes through the single Knox host. Knox is
>>>>>>>> responsible for fetching from the datanodes and consolidating to
>>>>>>>> send to you.
>>>>>>>> This means overhead as it’s acting as a middle man, and lower
>>>>>>>> network capacity since only 1 host is serving data to you.
>>>>>>>>
>>>>>>>> Also, if running on a cloud provider, the Knox host may be a
>>>>>>>> smaller instance size with lower network capacity.
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Sean Roberts
>>>>>>>>
>>>>>>>> From: Guang Yang <k...@uber.com>
>>>>>>>> Reply-To: "user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> Date: Tuesday, 4 September 2018 at 07:46
>>>>>>>> To: "user@knox.apache.org" <user@knox.apache.org>
>>>>>>>> Subject: WebHDFS performance issue in Knox
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We're using Knox 1.1.0 to proxy WebHDFS requests. If we download a
>>>>>>>> file through WebHDFS in Knox, the download speed is just about 11M/s.
>>>>>>>> However, if we download directly from a datanode, the speed is at
>>>>>>>> least 40M/s.
>>>>>>>>
>>>>>>>> Are you guys aware of this problem? Any suggestions?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Guang
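
For anyone wanting to reproduce the comparison discussed in this thread (direct vs. through Knox, SSL on vs. off), here is a minimal Python sketch of one way to time a streamed download. It is only an illustration, not something anyone in the thread used: the function names `measure_throughput` and `slowdown` are made up for this example, and the units assume 1 MB = 1,000,000 bytes. It works on any file-like object, so in a real test you would pass it a response object for your own WebHDFS or Knox URL.

```python
import io
import time
from typing import BinaryIO, Callable, Tuple


def measure_throughput(
    stream: BinaryIO,
    chunk_size: int = 1 << 20,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float, float]:
    """Read `stream` to EOF and return (bytes_read, seconds, MB_per_second).

    Hypothetical helper for illustration; 1 MB is taken as 1,000,000 bytes.
    """
    total = 0
    start = clock()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        total += len(chunk)
    elapsed = clock() - start
    mb_per_s = (total / 1_000_000) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s


def slowdown(baseline_mb_s: float, observed_mb_s: float) -> float:
    """How many times slower the observed rate is than the baseline."""
    return baseline_mb_s / observed_mb_s


if __name__ == "__main__":
    # Offline demo on an in-memory buffer, so nothing here touches a cluster.
    demo = io.BytesIO(b"\0" * (8 * 1024 * 1024))
    n, secs, rate = measure_throughput(demo)
    print(f"read {n} bytes in {secs:.4f}s ({rate:.1f} MB/s)")

    # The figures reported in the thread: ~400M/s without SSL, ~40M/s with SSL.
    print(f"slowdown factor: {slowdown(400.0, 40.0):.0f}x")
```

To point this at a real endpoint, the object returned by `urllib.request.urlopen(url)` can be passed as `stream`; the URL, authentication, and any trust settings for the gateway certificate depend on your deployment and are not shown here. Running it alongside `top` on the Knox host, as Kevin suggests, would show whether the gateway is CPU-bound during the SSL run.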